ACTIVAT??? OR INITIAT??? OR START??? OR ACTUAT??? OR BOOT???)

S7 24 S6(10N)S5(5N)S3(10N)(S1 OR S2)

S8 33 (S6(20N)S5(5N)S3(20N)(S1 OR S2)) NOT S7

S9 126 (S6(50N)S5(5N)S3(50N)(S1 OR S2)) NOT (S7 OR S8)

S10 79 S9 AND IC=(G06F)

S11 229 (S6(100N)S5(5N)S3(100N)(S1 OR S2)) NOT (S7:S8 OR S10)

S12 9 S11 AND (IC=(G06F-017/00) OR IC=(G06F-007/00) OR IC=(G06F-017/30))

S13 22 (S11 AND (IC=(G06F-017?) OR IC=(G06F-007?))) NOT (S7:S8 OR S10 OR S12)
S4 45255 (NUMBER OR COUNT???)(3W)(OCCURRENCES OR TIMES)
PUB. NO.: 10-271014 [JP 10271014 A]

PUBLISHED: October 09, 1998 (19981009)

INVENTOR(S) : NISHIO FUMIYOSHI 
HIRATA TAKASHI 

APPLICANT(s) : TAMURA ELECTRIC WORKS LTD [350937] (A Japanese Company or 

Corporation), JP (Japan) 
APPL. NO.: 09-075499 [JP 9775499] 
FILED: March 27, 1997 (19970327) 

INTL CLASS: [6] H03M-007/42 

JAPIO CLASS: 42.4 (ELECTRONICS -- Basic Circuits) 

ABSTRACT 

PROBLEM TO BE SOLVED: To reduce a retrieval time of a dictionary by 
decreasing number of reference times of a route node in the dictionary in 
the case of data compression. 

SOLUTION: The method is provided with a dictionary where a head character 
code of a received character string is registered to an address of a route 
node denoting a retrieval path of succeeding characters of the character 
string, and in the case of compression conversion from the head character 
of the input character string according to registered information of the 
dictionary sequentially when the character string is received, a table 
section 142a that stores character codes in the order of number of 
times of incidence of the head character of the input character string 
is provided, and when the character string is received, the head 
character code of the input character string is compared sequentially 
with a character code with many number of times of incidence of the 
table section and an address of the route node of the dictionary is 
selected depending on the comparison result. 
ELECTRONIC FILING DEVICE AND FILE PROCESSING METHOD

PUB. NO. : 07-271823 [JP 7271823 A] 
PUBLISHED: October 20, 1995 ( 19951020) 
INVENTOR(s): KIZAKI SHIGEKI 

APPLICANT(s): TOSHIBA CORP [000307] (a Japanese Company or Corporation), JP 
(Japan) 

APPL. NO.: 06-065138 [JP 9465138] 

FILED: April 01, 1994 (19940401) 

INTL CLASS: [6] G06F-017/30; G11B-027/00; H04N-001/21

JAPIO CLASS: 45.4 (INFORMATION PROCESSING — Computer Applications); 42.5 
(ELECTRONICS — Equipment); 44.7 (COMMUNICATION -- Facsimile) 

JAPIO KEYWORD : R011 (LIQUID CRYSTALS) 

ABSTRACT 

PURPOSE: To process filing data in accordance with the number of times of
matching in each retrieving condition by providing the electronic filing
device with a control means for processing each file data based on the
number of times of matching in each retrieving condition obtained by a
retrieving means.

CONSTITUTION: When a retrieving condition and a retrieving range are
set up, a CPU 11 executes following file retrieving processing through a
file processing part 12. Namely the file processing part 12 retrieves
file data satisfying one of specified retrieving conditions from a file
storing part 14. When a retrieving condition for a certain file data
matches with the specified condition, the processing part 12 stores the
retrieved result in a previously prepared retrieved result table 13. In
this case, when the number of times of matching in each file to be
retrieved with each retrieving condition or the number of times of
matching in a character, a code, etc., are specified in the retrieving
condition, the number of times of matching is stored in a file
recording part 14 as a retrieved result. Each file data are retrieved or
sorted based on the number of times of matching.
DATA SAVE AREA ESTIMATION PROCESSING DEVICE

PUB. NO.: 05-053824 [JP 5053824 A]
PUBLISHED: March 05, 1993 (19930305)
INVENTOR(s): KIMURA YUKIHIRO

APPLICANT(s): FUJITSU LTD [000522] (A Japanese Company or Corporation), JP
(Japan)

APPL. NO.: 03-214022 [JP 91214022]
FILED: August 27, 1991 (19910827)

INTL CLASS: [5] G06F-009/45

JAPIO CLASS: 45.1 (INFORMATION PROCESSING -- Arithmetic Sequence Units)
JOURNAL: Section: P, Section No. 1570, Vol. 17, No. 363, Pg. 96, July
08, 1993 (19930708)

ABSTRACT 

PURPOSE: To obtain the data save area estimation processing device for
estimating an effective required amount concerning a save area for
allocating a register in the case of a register depending on the compiler
of a computer.

CONSTITUTION: When there is an intermediate instruction in an intermediate
text 4 so as to define a register name as an operand, a control time number
processing part 1 retrieves a register name list 5 for the relevant
register name prepared by the compiler and when the register name is a
defining operand, a processing is requested to an extended register name
list setting part 2. When the register name is a reference operand, a
value '1' is subtracted from the number of times for control provided 
for a relevant extended register name list 6, and when the number of 
times for control is '0' or there is no control, the extended register 
name list setting part 2 makes the extended register name list to be 
newly provided correspondent to the designated register name list 5 and 
copies the number of times for referring to the register name list to 
the number of times for control at the extended register name list 6. 
Then, an area estimation part 3 estimates the save area from the extended 
register name list 6 showing the result of the above-mentioned 
processing. 
COMMON CONTROL SYSTEM FOR PROGRAM WORK AREA

PUB. NO.: 04-317136 [JP 4317136 A]
PUBLISHED: November 09, 1992 (19921109)
INVENTOR(s): NAGASHIMA MASAYOSHI
             IWATANI MASAMUNE
APPLICANT(s): NEC CORP [000423] (A Japanese Company or Corporation), JP
              (Japan)
              NEC SOFTWARE LTD [491061] (A Japanese Company or Corporation),
              JP (Japan)
APPL. NO.: 03-084919 [JP 9184919]
FILED: April 17, 1991 (19910417)
INTL CLASS: [5] G06F-009/46
JAPIO CLASS: 45.1 (INFORMATION PROCESSING -- Arithmetic Sequence Units)
JOURNAL: Section: P, Section No. 1508, Vol. 17, No. 147, Pg. 29, March
24, 1993 (19930324)



ABSTRACT 

PURPOSE: To prevent the load of a central processing unit from considerably 
increasing owing to the supervisory of a state even if multiple programs 
come to a waiting state for using the same common work area by changing a 
supervisory time interval in accordance with the number of times for 
referring to a state storage area. 

CONSTITUTION: The state storage area 1 storing a flag showing the use
state of the common work area and the number of reference times 
within prescribed time, a program management area 2 storing the name 
of the program in the middle of standby and time when the state storage 
area 1 is referred to, a supervisory time interval table 3 to which 
relation between the number of reference times and the supervisory time 
interval is registered, a work area state judgement means 4 which refers to 
the state storage area 1, stores reference time in the program management 
area 2 when the common work area is used and temporarily stops the program 
and a clock means 5 which retrieves the program management area 2 at the 
prescribed time interval and starts the program in the middle of standby 
when the standby time of the program in the middle of standby is longer 
than the supervisory time interval concerned of the supervisory time 
interval table 3 are provided. 
PUB. NO.: 04-021181 [JP 4021181 A]
PUBLISHED: January 24, 1992 (19920124)
INVENTOR(s): TSUMURA KAZUHIRO

APPLICANT(s): TOSHIBA CORP [000307] (A Japanese Company or Corporation), JP
(Japan)

APPL. NO.: 02-125910 [JP 90125910]
FILED: May 16, 1990 (19900516)

INTL CLASS: [5] G06F-015/40; G06F-009/44; G06F-015/40

JAPIO CLASS: 45.4 (INFORMATION PROCESSING -- Computer Applications); 23.1
(ATOMIC POWER -- General); 45.1 (INFORMATION PROCESSING --
Arithmetic Sequence Units)
JAPIO KEYWORD: R139 (INFORMATION PROCESSING -- Word Processors)
JOURNAL: Section: P, Section No. 1346, Vol. 16, No. 182, Pg. 78, April
30, 1992 (19920430)

ABSTRACT 

PURPOSE: To shorten a processing time and to expand functions by executing 
the required matching processing of a word constituting an input keyword, 
referring an auxiliary image identification (ID) density map corresponding 
to the matched result and retrieving an image memory developing document 
data. 



CONSTITUTION: A CPU 5a for inputted the keyword of a connection pattern of
character codes to a computer system 5 executes matching processing for
adding data such as a different number corresponding to each word to a
character position corresponding to the end of each word by the number
of times of processing determined by the maximum number of characters of
the word forming the keyword. The auxiliary areas of memories 2a to
2c in which auxiliary images expressing the document positions of the image
memories 2a to 2c developing retrieving data and knowledge as table
format documents by ID density are stored are retrieved based upon the
matched result and the retrieving data and the knowledge of the memories
2a to 2c are detected in accordance with the retrieved results. Since the
retrieval is executed by simple algorithm, the knowledge processing time
can be shortened and the functions can easily be extended.
August 22, 1991 (19910822)
NEC CORP [000423] (A Japanese Company or Corporation), JP
(Japan)
November 19, 1991 (19911119)



ABSTRACT 

PURPOSE: To prevent the search of abbreviated dialing corresponding to
the telephone number of a desired opposite party from taking labor and time
by position-changing the storing position of the telephone number of high
using frequency and the name of the opposite party to the younger number of
the abbreviated dial number, and displaying a registered party list in
the order of the high using frequency.

CONSTITUTION: The circuit part of an abbreviated dialing device to register
and display the using frequency order of registered telephone numbers is
provided with a registration control circuit 1, a display and selection
part 2, an operating panel 3, a network control part (NCU) 4, and a line
part 5. Here, when the telephone number of the opposite party facsimile
device and the name of the opposite party are stored as corresponding to
the abbreviated dial number, and the telephone numbers of the respective
opposite parties are used, the number of times of the use of them is
counted, and is stored in a corresponding area, and the storing position
of the opposite party telephone number and the name of the opposite party
is changed according to the number of times of its using frequency.
Thus, work to search the telephone number of the opposite party from the
registered party list can be executed efficiently.
Computer Applications)

Kanji Information Processing) 

. 12, No. 51, Pg. 17, 



ABSTRACT

PURPOSE: To make it possible to make kana (Japanese syllabary)-kanji
(Chinese character) conversion efficiently by setting a specified
conversion character strings out of kana letter strings, determining the
number of retrieval by a number of retrieval determining means
according to the length of the conversion character string, and outputting
kanji data detected by number of times of retrieval of determined number.



CONSTITUTION: By converting a kana letter string inputted to an inputting 
area 3 to kanji, number of times f of retrieval of kanji according 
to the number of characters of character length M actually converted is 
set. As outputting is made determining priority order out of retrieved F, 
necessary kana-kanji conversion can be made in a short time. A table used 
as a base for operation by a number of times of retrieval arithmetic 
section 6 is set to the number of retrieval data necessary for ordinary 
retrieval of number of characters to be retrieved . 
ABSTRACT

PURPOSE: To detect a deadlock efficiently by retrieving a loop in the
order of transactions with a longer time of wait and recognizing a detected
closed route to be in a deadlock state.

CONSTITUTION: Data processors 1-3 share a common-use file 4, and respective
data control parts 11, 21, and 31 are provided with control tables 12,
22, and 32, which are provided with a transaction (TR) name, wait
occurrence time, factor, count field, etc. For example, TRA1 when
using data B makes a declaration of occupation at, for example, time 800 in
the data processor 1, but the data B is occupied by TRA2 and a wait state
is entered. In this case, a data control part 11 records the state in the
table 12. The control part 11 detects a deadlock (DL) from the TRA1 and
enters the value of a counter indicating the number of times of DL
detection of A1. Then, loop detection is carried out according to the
longer wait time of TRs among TRs which occupy the data B, and a closed
loop when detected is judged to be in a deadlock state. Thus, the DL is
detected efficiently.
Multimedia presentation e.g. news broadcast, representation processing
method, involves presenting list of named entities and their
corresponding number of occurrences, and extracting story summary data
using entities as basis

Patent Assignee: MITRE CORP (MITR-N)

Inventor: MAYBURY M T; MERLINO A E

Number of Countries: 001 Number of Patents: 001 

Patent Family: 

Patent No Kind Date Applicat No Kind Date Week
US 6961954 B1 20051101 US 9765947 P 19971027 200577 B

US 9833268 A 19980302

Priority Applications (No Type Date): US 9765947 P 19971027; US 9833268 A
19980302
Patent Details:

Patent No Kind Lan Pg Main IPC Filing Notes

US 6961954 B1 36 G06F-003/00 Provisional application US 9765947
Abstract (Basic): US 6961954 B1

NOVELTY - The method involves selecting a contiguous portion of a
multimedia presentation as a story segment. Named entities are
extracted from a text information stream corresponding to the segment.
A list of named entities and their corresponding number of
occurrences in the segments over a selected time period are
presented in response to a search query. Story summary data is
extracted using the entities as a basis.

USE - Used for processing a representation of a multimedia 
presentation e.g. news broadcast. 

ADVANTAGE - The method automatically annotates and summarizes the
multimedia data representative of information in a news broadcast, so
that it is visualized, searched, and disseminated in a compatible
manner. The method enables timely and efficient e.g. low bandwidth, 
communication and storage of the multimedia data. 

DESCRIPTION OF DRAWING (S) - The drawing shows a block diagram of an 
automated system for analyzing, selecting, condensing and presenting 
information derived from broadcast news. 

Media source (102) 

Story classifier (133) 

Multimedia database management system (140) 
Video and metadata (142)
File server (160) 
pp; 36 DwgNo 1/22 

Title Terms: PRESENT; NEWS; BROADCAST; REPRESENT; PROCESS; METHOD; PRESENT; 
LIST ; NAME ; ENTITY; CORRESPOND; NUMBER; OCCUR; EXTRACT; SUMMARY; DATA; 
ENTITY; BASIS 
Derwent Class: T01; W01 

International Patent Class (Main): G06F-003/00

International Patent Class (Additional): G06F-013/00; H04N-005/445
File Segment: EPI
Processing method of electronic documents e.g. patents - involves
locating number of hit entries that indicate number of times search 
keyword appears in document and location entries that indicate 
occurrences of search keyword in documents 

Patent Assignee: SMARTPATENTS INC (SMAR-N)

Inventor: AHN D

Number of Countries: 001 Number of Patents: 001
Patent Family:

Patent No Kind Date Applicat No Kind Date Week
US 5848409 A 19981208 US 93155752 A 19931119 199905 B
                       US 94341129 A 19941118
                       US 95422528 A 19950414
                       US 97905727 A 19970804

Priority Applications (No Type Date): US 95422528 A 19950414; US 93155752 A
19931119; US 94341129 A 19941118; US 97905727 A 19970804
Patent Details:

Patent No Kind Lan Pg Main IPC Filing Notes
US 5848409 A 14 G06F-017/30 CIP of application US 93155752
                            CIP of application US 94341129
                            Cont of application US 95422528
                            CIP of patent US 5623681
                            Cont of patent US 5696963

Abstract (Basic): US 5848409 A 

The method involves locating the number of hit entries in a group 
hit table (204) associated with a search keyword. The hit entries 
correspond to documents in which the search keyword appears. Multiple 
location entries are located in a document index table associated 
with the document in which the search keyword appears. 

Each location entry corresponds to different occurrences of the
search keyword in the documents. The information relating to the
number of times the keyword appears in the document or the
portion of the documents containing occurrences of the keyword, are
presented to the user.

USE - For books, magazines, articles, journals. 

ADVANTAGE - Presents information relating to number of times
search keyword occurs or portions of document containing 
occurrences of keyword to user efficiently. 

Dwg.2/7

Title Terms: PROCESS; METHOD; ELECTRONIC; DOCUMENT; PATENT; LOCATE;
NUMBER; HIT; ENTER; INDICATE; NUMBER; TIME; SEARCH; KEYWORD; APPEAR;
DOCUMENT; LOCATE; ENTER; INDICATE; OCCUR; SEARCH; KEYWORD; DOCUMENT
Derwent Class: T01

International Patent Class (Main): G06F-017/30
File Segment: EPI
Database search record ranking method - involves assigning weights to
record index entries according to frequency of information and parsing
queries associated with index entries

Patent Assignee: DIGITAL EQUIP CORP (DIGI)
Inventor: BURROWS M

Number of Countries: 001 Number of Patents: 001 
Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 

US 5765150 A 19980609 US 96695905 A 19960809 199830 B 

Priority Applications (No Type Date): US 96695905 A 19960809 
Patent Details: 

Patent No Kind Lan Pg Main IPC Filing Notes
US 5765150 A 42 G06F-017/30 

Abstract (Basic): US 5765150 A 

The ranking method involves indexing the records of the database by
storing index entries in a memory to create an index. Each index entry
includes a word entry representing a unique portion of database
information and one or more location entries indicating where the
information occurs in the database records. A weight is assigned to
each index entry according to a relative frequency of occurrence of the
information in the database. A query is parsed into terms and
operators, each term associated with a corresponding index entry. Index
entries are sequentially searched to locate database records qualified
by the terms and operators of the query.

Each located record is scored according to the number of times
portions of information corresponding to the terms of the query
occur in each record and their associated weights. The scores and
identities of the located records are stored in a ranking list
having a predetermined number of entries. In response to searching a
predetermined fraction of the index, it is determined if any unlocated
records of the database can receive a score higher than one of the
records stored in the ranking list based on the index entries
corresponding to the terms having a lowest weight. If not, the index is
searched using only the index entries having weights higher
than the lowest weight.

ADVANTAGE - Maximises search of index query terms likely to
provide records of interest to users.

Dwg.2/26 

Title Terms: DATABASE; SEARCH; RECORD; RANK; METHOD; ASSIGN; WEIGHT;
RECORD; INDEX; ENTER; ACCORD; FREQUENCY; INFORMATION; PARSE; QUERY;
ASSOCIATE; INDEX; ENTER

Derwent Class: T01

International Patent Class (Main): G06F-017/30
File Segment: EPI
Search record ranking method for database - involves weighting indexed
database records according to frequency of information occurrence and
parsing query into operators associated with index entry
US 5765149 A 33 G06F-017/30

Abstract (Basic): US 5765149 A 

The ranking method involves indexing the records of the database by
storing index entries in memory to create the index. Each index entry
includes a word entry representing a unique portion of database
information and one or more location entries indicating where the
information occurs in the database records. A weight is assigned to
each index entry according to a relative frequency of occurrence of the
information portion in the database. A query is parsed into terms and
operators, each term associated with a corresponding index entry. Index
entries are sequentially searched to locate records which are qualified
by the terms and operators of the query.

Each located record is scored according to the number of times
portions of information corresponding to the terms of the query
occur in each record and their associated weights. The scores and
identities of the located records are stored in entries of a ranking
list having a predetermined number of entries. In response to the
ranking list becoming full, it is determined if any unlocated records
of the database can receive a score higher than one of the records
stored using index entries having a lowest weight, and if not,
searching the index using index entries having weights higher than
index entries having the lowest weight.

ADVANTAGE - Presents search results in usable manner relieving 
users of having to pursue all qualifying records. 
The FIFO data write and read method involves sequentially writing
data supplied to the input port into the memory in units having the
same bit length as the longer of the two FIFO fixed and different bit
length input and output ports. The data is sequentially read from the
memory through the output port in units in the same order as the data
was written into the memory, with the unit bit length equal to the FIFO
output port bit length.

Pref. the FIFO memory input port receives a short word and an
output port provides a long word an integer number N times larger
than the input word bit length, and the FIFO memory areas are
arranged in a matrix of rows and columns. The method pref. involves
sequentially writing data as a series of short words through the input
port into the memory to consecutively store sets of N short words in
rows of consecutive memory areas, and successively reading the stored
data through the output port as a single longer word.

ADVANTAGE - Eliminates bus exchanger in e.g. system with two CPUs
for processing different bit length words; fast data bit length
conversion.
Set Items Description

S1 7419281 TRADEMARK? ? OR TRADE()MARK? ? OR TRADENAME? ? OR NAME? ? OR LOGO OR LOGOS

S2 8896767 KEYWORD? ? OR WORD? ? OR TERM? ?

S3 67300 (S1:S2 OR MATCH??? OR HIT OR HITS OR RESULT???)(5N)(HIGHLIGHT? OR HILIGHT? OR HILIT??? OR (HI OR HIGH)()(LIT??? OR LIGHT???))

S4 306908 (S1:S2 OR MATCH??? OR HIT OR HITS OR RESULT???)(7N)(FREQUEN? OR OCCURR? OR INCIDENCE? ? OR APPEAR?)

S5 19261 S4(20N)(METATAG? ? OR META()TAG? ? OR HIDDEN OR TITLE? ? OR HYPERLINK? ? OR LINK? ? OR PARAGRAPH? ? OR SECTION? ? OR PORTION? ? OR AREA? ? OR PIECE? ? OR SEGMENT? ?)

S6 8948666 SEARCH??? OR QUERY??? OR QUERIE? ? OR RETRIEV??? OR FIND??? OR DISCOVER??? OR LOCATE? ? OR LOCATING

S7 131 S3(50N)S5(50N)S6

S8 95 RD (unique items)

S9 70 S8 NOT PY=2001:2006

S10 37747 (NUMBER OR COUNT???)(3W)(OCCURRENCES OR TIMES)

S11 638 COUNT???(3N)OCCURRENCE? ?

S12 188 S10:S11(7N)(S1:S2 OR MATCH??? OR HIT OR HITS OR RESULT???)(7N)(METATAG? ? OR META()TAG? ? OR HIDDEN OR TITLE? ? OR HYPERLINK? ? OR LINK? ? OR PARAGRAPH? ? OR SECTION? ? OR PORTION? ? OR AREA? ? OR PIECE? ? OR SEGMENT? ?)

S13 97 S6(100N)S12

S14 75 RD (unique items)

S15 59 S14 NOT PY=2001:2006

S16 115 S10:S11(7N)(S1:S2 OR MATCH??? OR HIT OR HITS OR PHRASE? ? OR STRING? ?)(7N)(CATEGOR??? OR FIELD? ?)

S17 80 S6(100N)S16

S18 47 RD (unique items)

S19 31 S18 NOT (S15 OR PY=2001:2006)

S20 630 S10:S11(15N)(S1:S2 OR MATCH??? OR HIT OR HITS OR PHRASE? ? OR STRING? ?)(15N)(METATAG? ? OR META()TAG? ? OR HIDDEN OR TITLE? ? OR HYPERLINK? ? OR LINK? ? OR PARAGRAPH? ? OR SECTION? ? OR PORTION? ? OR AREA? ? OR PIECE? ? OR SEGMENT? ? OR CATEGOR??? OR FIE

S21 181 S20(10N)(COLUMN?? OR GRID? ? OR ARRAY? ? OR TABLE? ? OR LIST???? OR REPORT???)

S22 136 S6(100N)S21

S23 88 RD (unique items)

S24 41 S23 NOT (PY=2001:2006 OR S19 OR S15 OR S9)
World Wide Web search engines have become the most heavily-used online
services, with millions of searches performed each day. Their popularity is
due, in part, to their ease of use, which stems primarily from their use of
relevancy searching (also called statistical or fuzzy searching).

Many search engines support Boolean operators, field searching, and
other advanced techniques, but with relevancy searching users simply enter
their terms and click the Search button. While searches may retrieve
thousands of hits, search engine producers claim their systems place items
that best match the search query at the top of the results list. In this
study, we test how five major search engines retrieve and rank documents in
answer to sample search queries.

RELEVANCY AND RANKING 

The basic premise of relevancy searching is that results are sorted, 
or ranked, according to certain criteria. Criteria can include the number 
of terms matched, proximity of terms, location of terms within the 
document, frequency of terms (both within the document and within the
entire database), document length, and other factors. The exact "formula"
for how these criteria are applied is the "ranking algorithm" and varies 
among search engines. 

In the highly-competitive search engine industry, ranking
algorithms are closely-guarded company secrets. Most search engine
producers, however, give a general description of criteria they consider in
computing a page's ranking "score" and its placement in the results list.
HotBot, for example, describes term frequency and location as primary
factors [1]. Documents with more occurrences of the search term receive a
higher weight, but the overall obscurity of the term within the database
also has an impact. In addition, the number of occurrences relative to the
document length is considered, and shorter documents are ranked higher than
a longer document with the same number of occurrences. Terms in the
title or metatags are weighted higher than terms only within the text.
AltaVista considers these factors, as well as the number of terms matched
and the proximity of search terms [2].
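
The factors described above can be combined into a toy score. The following is a minimal sketch only, not any engine's actual algorithm (those are proprietary, as noted above). The function name `score_document`, the title boost of 2.0, and the logarithmic rarity weight are assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def score_document(query_terms, title, body, doc_freq, num_docs, title_boost=2.0):
    """Toy relevance score (illustrative assumptions throughout).

    doc_freq maps a term to the number of database documents containing it;
    num_docs is the total number of documents in the database.
    """
    body_words = re.findall(r"[a-z0-9]+", body.lower())
    title_words = set(re.findall(r"[a-z0-9]+", title.lower()))
    counts = Counter(body_words)
    length = max(len(body_words), 1)

    score = 0.0
    for term in query_terms:
        term = term.lower()
        # Occurrences relative to document length: a shorter document with
        # the same number of occurrences scores higher.
        tf = counts[term] / length
        # Terms that are rarer in the database receive a larger weight.
        rarity = math.log((1 + num_docs) / (1 + doc_freq.get(term, 0))) + 1.0
        # Terms that also appear in the title are weighted higher.
        weight = title_boost if term in title_words else 1.0
        score += weight * tf * rarity
    return score
```

With this sketch, a three-word page containing "elderly" once outscores a much longer page containing it once, and the same page scores higher again if "elderly" also appears in its title.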

Other search engines provide less information about their ranking
criteria, but do mention some elements. Infoseek gives extra weight to
terms in the title and metatags [3]. Lycos considers terms in the title and
headings, but does not give extra weight to terms in metatags [4]. Excite
does not index terms in metatags. In addition to retrieving documents
that contain the search term(s), Excite analyzes the content of the
documents for related phrases in a process it calls Intelligent Concept
Extraction (ICE). Thus, a search on "elderly people" may also retrieve
documents on "senior citizens" [5]. More recently, several search engines
have begun considering factors such as the number of links made to a page
or the number of times a page is accessed from a results list [6].

IMPACT OF RANKING 

Results ranking has a major impact on users' satisfaction with Web
search engines and their success in retrieving relevant documents. Yet, 
little research has been done in this area. Yuwono and Lee described and 
tested four basic algorithms for ranking Web search results, but they did 
not examine the ranking of results from the major engines [7]. 

A few studies have measured search engines' precision, or the
ability to retrieve relevant results. Chu and Rosenthal tested the
precision of three major search engines by judging the relevance of the
first ten hits for ten search queries [8]. AltaVista received an
average score of 0.78, meaning that 78% of items retrieved were judged
relevant. Scores for Excite (0.45) and Infoseek (0.59) were also reported.
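
Precision as used in these studies is the fraction of examined hits judged relevant, averaged over the test queries. A minimal sketch follows; the function names and the representation of judgments as lists of booleans are assumptions for illustration and are not taken from the cited studies.

```python
def precision_at_k(judgments, k):
    """Fraction of the first k retrieved items that were judged relevant.

    judgments is a ranked list of booleans (True = judged relevant).
    """
    top = judgments[:k]
    if not top:
        return 0.0
    return sum(1 for judged in top if judged) / len(top)


def mean_precision(judgments_per_query, k):
    """Average precision_at_k over several queries."""
    if not judgments_per_query:
        return 0.0
    scores = [precision_at_k(j, k) for j in judgments_per_query]
    return sum(scores) / len(scores)
```

For example, a query whose first ten hits include eight relevant items has a precision of 0.8 at k = 10.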

Ding and Marchionini conducted a similar study by examining the first
20 hits for five search queries and reported scores for Infoseek (0.27),
Lycos (0.43), and Opentext (0.40) [9]. In an unpublished study, Leighton
and Srivastava tested 15 search queries in five major search engines and
reported scores for three different levels of relevancy--links that
satisfied the search statement, potentially useful links, and clearly
useful links [10]. When testing only to see if results satisfied the search
expression, the following scores were reported: AltaVista (0.90), Excite
(0.93), HotBot (0.72), Infoseek (0.87), Lycos (0.61).

HOW USERS JUDGE RESULTS LISTS 

These studies share a common design in that each examined and judged
the relevancy of the first 10 to 15 items retrieved by the search. While
this is an effective methodology for determining precision, our experience
shows this is not how users use their results lists.

Users are much more likely to scan their results list and retrieve 
only selected documents. The user may consider a number of factors in 
deciding whether or not to retrieve a document, but a key factor, as 
Matthew Koll suggests, is the number of terms matched: 

Regardless of relevance-ranking theory, users have an intuitive sense 
of how well the relevance ranking is working, and a key indicator of this 
intuitive satisfaction is the number of distinct query words that a 
document contains. For example, a ranking algorithm should not under any 
circumstances rank a document containing only two query words from an 
eight-word query higher than a document containing all eight words [11]. 

Results lists contain only limited information about the document,
typically the title and a "summary" that is usually generated from the
first few lines of text or description metatags. Other information may also
be provided, such as the date of last update, URL, size of document, etc.
As the user scans the list for occurrences of query terms, documents that
contain the terms in the title will be readily apparent. If the user scans
summaries, terms contained in a header near the top of the page or in
description metatags may also be visible. The user can also determine which
items display search terms in close proximity or as a phrase. When a
document is retrieved, the browser's Edit/Find function can locate terms
and phrases in the document.

CRITERIA FOR TESTING RELEVANCY RANKING 

While a user will typically browse only the first few pages of
results, the ranking of those results provided by the search engine is 
crucial to the success of the search session--and the user's perception of 
his satisfaction with the results. 

Search engine algorithms consider a number of criteria, but the 
number of terms matched, location of terms in title, headers or metatags, 
and the proximity of terms are most easily assessed by users. Results that
meet these criteria are most likely to satisfy users' "intuitive sense of how
well the relevance ranking is working."

Other factors, such as frequency of search terms in the entire 
database or frequency of terms in relation to document length, may be part 
of the ranking algorithm, but are more difficult for the user to assess. 
With this premise in mind, we identified three basic tests to judge 
ranking: 

1. All Terms: Are documents that contain all search terms ranked 
higher than documents that do not contain all search terms? 

2. Proximity: For documents that contain all search terms, are 
documents that contain search terms as a contiguous phrase ranked higher 
than documents that do not? 

3. Location: For documents that contain all search terms, are 
documents that contain search terms in the title, headings, or metatags 
ranked higher than documents that contain terms only within the body of the 
document? 

Given the overall complexity and proprietary nature of ranking
algorithms, it is impossible to consider all of their elements. Instead, our
intent was to implement tests that would provide a general idea of the
overall reliability of web search engine ranking.

METHODOLOGY

Search engines that scored highly in comparison tests in popular
computing magazines were selected for the study: AltaVista, HotBot,
Excite, Infoseek, and Lycos. Northern Light, now one of the major
search engines, had not achieved wide acclaim at the time of the study.

To test for the presence of all terms and proximity, we devised 
multiple-term search statements. Twelve phrases were selected, with an 
equal number of two- and three-word phrases. Most topics were taken from 
actual reference questions, although some were taken from previous studies. 
Search topics were equally distributed among the humanities, sciences, and 
social sciences. The following search topics were selected: 

* credit card fraud 

* quantity theory of money 

* liberation tigers 

* evolutionary psychology 

* french and indian war 

* classical greek philosophy 

* beowulf criticism 

* abstract expressionism 

* tilt up concrete 

* latent semantic indexing 

* fm synthesis 

* pyloric stenosis 

Previous studies constructed search statements using Boolean
operators, +/- modifiers, and enclosed phrases in double quotes. However,
we chose to enter each statement exactly as shown with no operators,
modifiers, or quotes. This was done to provide the most rigorous test of
ranking ability and to approximate the type of searching done by most
users. A study of over 50,000 searches performed by more than 18,000 Excite
users by Jansen et al. indicated that AND was used in fewer than 7% of
search statements, and that +/- and double quotes were used in fewer than
6% of searches [12]. Default settings were used in all search engines.
Searches were run between April 3 and April 10, 1998.

For each search, the first 100 items were downloaded. The total 
number of hits for each search was not recorded, but all searches produced 
at least 100 hits. Perl scripts were written to facilitate downloading and
to analyze the content of each document. Scripts recorded the ranking 
position of each document, and produced a "yes" or "no" score for each of 
the following tests: 

1. All Terms: Does the document contain at least one occurrence of 
all search terms? 

2. Proximity: Is there at least one occurrence of all search terms
appearing as a contiguous phrase?

3. Location: Is there at least one occurrence of all search terms
appearing within the title, H1-H6 headers, or metatags?

The case of terms was not considered and all plurals, variant word
endings, and stemming, e.g., "up" as part of "guppy," were ignored. All
text in the <header> and <body> tags was analyzed.

Text in the author, description, and keyword metatag fields was 
analyzed. Since Excite does not index metatags, terms occurring in these 
fields were not analyzed in the Excite results. For the Location test, all 
terms had to be present in the title, headings, or metatags to receive a 
"yes" score. For example, if one of three search terms was present in the 
metatags and two of the three terms were present in the title, the document 
received a "yes" score. If two of three search terms were present in the 
title, but the third term occurred only in the text, the document received 
a "no" score. Metatag terms were not included in the Location test for 
Lycos, since Lycos does not give additional weight to metatag terms. 
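
The three yes/no tests can be sketched roughly as follows. This is an illustration in Python, not the Perl scripts used in the study; the function names and the tokenizing rule are assumptions. Terms are matched as whole words without regard to case, so "up" does not match "guppy," and the manual stopword adjustments made by the authors are not reproduced.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def all_terms(query, body):
    """'Yes' if every query term occurs at least once as a whole word."""
    words = set(tokenize(body))
    return all(term in words for term in tokenize(query))


def proximity(query, body):
    """'Yes' if the query terms occur at least once as a contiguous phrase."""
    terms, words = tokenize(query), tokenize(body)
    n = len(terms)
    return any(words[i:i + n] == terms for i in range(len(words) - n + 1))


def location(query, fields):
    """'Yes' if every query term occurs somewhere in the given fields.

    `fields` is a list of strings (title, headings, metatag contents);
    terms may be spread across the fields, as in the study's example.
    """
    words = set()
    for text in fields:
        words.update(tokenize(text))
    return all(term in words for term in tokenize(query))
```

With this sketch, location("credit card fraud", ["Credit Card Safety", "fraud prevention"]) gives a "yes" because the three terms are spread across the title and a metatag, while the same call with only the title gives a "no."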

For each document, the Perl scripts produced a report that displayed
the title and URL, terms contained in the author, description, or
keyword metatags, terms contained in the H1-H6 headings, total number
of words in the document, and number of occurrences of each term. In
addition, each occurrence of a search term was displayed in context (five
words on either side) along with the numerical position of the term within
the document. This information was used to manually review each document
and verify yes/no scores. For example, since some search engines do not
index certain stopwords, yes/no scores were adjusted manually to allow for
cases such as "french-indian war," "french & indian war," etc. These
reports also helped to identify cases where the page could not be
downloaded.
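
A term-in-context listing of the kind described above might look like the following sketch. It is written in Python rather than Perl, and the function name, the whitespace-based word splitting, and the 1-based position numbering are assumptions made for illustration.

```python
import re


def keyword_in_context(body, term, window=5):
    """Return (position, context) pairs for each occurrence of `term`.

    Position is the 1-based word number within the document; context is
    the term with up to `window` words on either side.
    """
    words = re.findall(r"\S+", body)
    hits = []
    for pos, word in enumerate(words, start=1):
        # Strip punctuation before comparing, and ignore case.
        if re.sub(r"\W+", "", word).lower() == term.lower():
            left = words[max(0, pos - 1 - window):pos - 1]
            right = words[pos:pos + window]
            hits.append((pos, " ".join(left + [word] + right)))
    return hits
```

A reviewer could scan such a listing to confirm or correct a script-generated yes/no score without rereading the whole page.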

For each search, the rank position of the last item that gave a
positive response to the question being tested was identified. Next, each 
negative response between the first item and the last positive item was 
recorded. This figure was then divided by the ranking position of the last 
item that gave a positive response to the question being tested. For 
example, in the search "credit card fraud" in AltaVista, the document 
retrieved in position #99 was the highest numbered item that produced a 
positive response to the "All Terms" question. Between position #1 and 
position #99, ten negative responses, i.e., documents that did not contain 
all terms, were recorded. The score for this search was calculated as 10/99 
or 10.1%. Lower percentages indicate a "better score", i.e., fewer 
instances where an item that did not satisfy the ranking criterion was 
ranked higher than an item that did meet the criterion. 
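
The scoring arithmetic can be expressed as a short sketch. The Python function below is illustrative only; its name and the handling of a list with no positive responses are assumptions, since the study reports that case nowhere.

```python
def misranking_score(responses):
    """Score a ranked list of yes/no responses (True = positive).

    The score is the number of negative responses ranked above the last
    positive response, divided by the rank of that last positive response.
    """
    if True not in responses:
        return None  # assumption: undefined when nothing satisfies the test
    # 1-based rank of the last positive response
    last_positive = len(responses) - responses[::-1].index(True)
    negatives = responses[:last_positive].count(False)
    return negatives / last_positive
```

For the "credit card fraud" example, a list whose last positive response is at position 99 with ten negatives above it yields 10/99, or 10.1% when rounded to one decimal place.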

Pages that could not be downloaded were given negative scores for all 
criteria, reflecting the most likely perception from an end-user's 
perspective. Someone attempting to retrieve the page and receiving a 404 or 
other error message, for example, would conclude that the document was not 
available and, therefore, did not answer the search statement. 

Jansen's study indicated that 80% of users viewed only the first two
pages of results [13]. This data suggests that few users actually review as
many as 100 items retrieved by a search, so the same analysis described
earlier was repeated on the first 20 hits. This would allow comparison
between the ranking capabilities based on 20 and 100 hits--is the ranking
more reliable within the first 20 items?

RESULTS 

For brevity and to facilitate comparison among search engines, the 
Table compares average scores for 100 and 20 hits. 

A score of 0.0% indicates a perfect score, i.e., all items that 
satisfied the criterion were ranked higher than items that did not. 

All Terms: Excite produced the best score of 5.0% for 20 hits. This
is surprising in light of Excite's ICE feature that allows for retrieval of
related phrases as well as exact terms in the search statement. It suggests
that items that contain only ICE-identified related phrases are ranked
considerably lower than items containing search terms.

Lycos also performed well, with a score of 5.4% for 20 hits and the
best score (8.4%) for 100 hits. This performance may be due to Lycos' use
of AND as the default operator. HotBot, which also uses AND as the default
operator, yielded much poorer scores of 12.3% and 19.5% for 20 and 100
hits, respectively, suggesting that its implementation of the AND operator
is not as effective as that of Lycos.

Proximity: AltaVista performed best on this test, with scores of
11.1% and 7.7% for 100 and 20 hits, respectively. It may be that proximity
is a heavily-weighted component of AltaVista's ranking algorithm,
particularly in light of its implementation in October 1998 of automatic
phrase searching [14]. Infoseek also performed well on this test, with
scores of 14.5% for 100 hits and 9.5% for 20 hits. Although Lycos did well
in the All Terms test, it had the poorest scores for Proximity for both 100
(48.7%) and 20 (26.3%) hits.

Location: Scores were much worse for this test, suggesting that
location is not heavily weighted in most algorithms. Given the metatag
indexing practices of Excite and Lycos, it was impossible to apply this
test consistently across all search engines. Further, terms in the author
and keyword metatags are visible only by viewing the HTML code for the
page. AltaVista's score of 10.4% for 20 hits was much better than all other
scores, which ranged from just under 30% to over 70%. AltaVista showed the
best score of 40.5% for 100 hits.

The search engines gave better scores when tested for the first 20
hits as compared to the first 100 hits. In many cases, the difference was
dramatic. For example, the search "quantity theory of money" produced a
score of 87% for the All Terms test in AltaVista for the first 100 hits,
but a perfect score for the first 20 hits. Similar occurrences are evident
in the "beowulf criticism" search for Location in Excite and "liberation 
tigers" and "fm synthesis" searches for Location in Lycos. For most 
searches, however, improvement in scores ranged from 5-25% and offered some 
evidence that end-users will see better ranking within the first 20 hits. 

There were also a number of cases where better scores were obtained
for 100 hits, and these can be seen in the Table. In the All Terms test,
for example, 11 searches among all the search engines yielded better scores
for 100 hits than for 20 hits.

All searches retrieved items that could not be downloaded, i.e., 
cases where the document has moved or no longer exists. The percentages of 
invalid links were 2.3% for Excite, 4.0% for AltaVista, 5.1% for Lycos, 
7.7% for Infoseek, and 9.3% for HotBot. Pages that could not be downloaded
were tallied as negative responses to all questions. This is more a measure 
of the freshness or update frequency of the database, but it had an impact 
on ranking scores. A document that contains all search terms may be ranked 
highly in the results list, but if the document is not available for 
viewing, it is of little value to the user. While the archiving feature of
Alexa may help to provide access to documents no longer available on the 
Web, we felt it was most appropriate to rate invalid links as items that 
did not satisfy the search query. 

GENERAL OBSERVATIONS 

It is not possible to draw strong conclusions from this exploratory 
study, but the results do support a few general observations. 

* While it is common to find documents that do not contain all search
terms ranked higher than documents that contain all terms, the ranking 
performance of the search engines is generally good, considering the size 
of the databases and the number of searches performed on these systems. 
Excite and Lycos had the best average scores of about 5% for the first 20 
hits, indicating that only one item that did not contain all terms was 
ranked higher than those that did. Even the worst score of 16% (Infoseek)
meant that only three or four hits out of 20 were "misranked."

* The Proximity test produced best scores of less than 10% for the 
first 20 hits (AltaVista and Infoseek), indicating that generally only one
or two items that do not contain all terms as a phrase are likely to be 
ranked higher than items that do contain the phrase. Similarly, the 
Location test produced a best score of just over 10% (AltaVista), but most 
scores for this test were much higher. 

* Ranking was consistently more reliable within the first 20 hits 
than within the first 100 hits. In a few cases, however, the first 100 hits
produced better scores than the first 20 hits. 

* Results varied widely by search topic. Some topics yielded very 
consistent ranking while others produced results lists with only a few 
documents that contained all terms scattered among many documents that did 
not. 

IMPLICATIONS 

These results have some implications for end-users. Many search 
engine comparisons and training sessions recommend using advanced 
techniques, such as Boolean operators, +/-, double quotes, or field 
searching. 

Leighton tested search statements employing Boolean operators, +/-, 
and double quotes, yet still reported that from 7% to almost 40% of the 
first 15 hits did not satisfy the search statement. These results are not 
dramatically better than those obtained in this study using relevancy 
searching, i.e., simply entering the terms without operators or modifiers. 
The best advice here may be to follow the suggestion of prominent search 
engine trainer Ran Hock and "disdain neither Boolean nor relevancy 
searching" [15].

Similarly, these results suggest that the practice of viewing only
the first few pages of results is a viable one. Search engine producers are
aware of this trend, and may fine tune their algorithms to work with
greater precision on the first few pages of results, while perhaps applying
grosser criteria on subsequent pages.

Since the time of this study, search engine producers have begun to
consider elements other than the content of the HTML document in their
ranking algorithms. Excite, Infoseek, and Lycos have added link popularity,
i.e., the number of hyperlinks made to a page, to their ranking algorithms
[16]. It has also been suggested that all search engines employ this
criterion [17].

HotBot, through a process developed by DirectHit, tracks the number 
of times sites are selected from results lists. Links to these "most 
visited sites" are displayed in a separate link that appears above the 
standard results list [18]. These approaches seem to be designed primarily 
to try to deliver relevant results for the high volume of one- and two-word
searches on popular topics. They may also have the effect of directing 
users to the most popular commercial sites, making it more difficult to 
locate less popular, but highly relevant pages. 

The proprietary nature of ranking algorithms makes them difficult to 
explore. The algorithms are under constant adjustment, both to increase 
their effectiveness and to prevent reverse engineering by WWW optimization
firms. Still, both search engine producers and end-users would benefit from 
increased attention by information professionals to this important element 
of Web searching. 

ABSTRACT: Knight-Ridder, owner of Dialog Information Services Inc., has
purchased Radio Schweiz AG, whose primary business is Data-Star, a database 
vendor providing access to business, pharmaceutical, medical and European 
directory databases. The purchase strengthens Dialog's European presence. 
Data-Star, which had become a notable competitor to Dialog, did not fit in 
with Radio Schweiz parent firm Motor-Columbus' strategic plans. 

TEXT: 

GENERAL NEWS 

Knight-Ridder Acquires Data-Star 

Knight-Ridder, owner of Dialog Information Services, Inc., has
purchased Data-Star from Motor-Columbus, the Swiss engineering technology
firm, for an undisclosed amount. Knight-Ridder acquired the equity of
Radio Schweiz AG (RadioSuisse), whose principal business is Data-Star.
Data-Star provides access to 250 medical, business, pharmaceutical and
European directory databases. Dialog's service contains more than 400
databases, primarily in the business, news, scientific and technical areas.

Reasons for the sale include Dialog's desire to create a stronger 
European presence and Motor-Columbus' apparent change in corporate 
direction and ensuing decision that Data-Star was not a strategic fit with 
the rest of their business. In recent years, Data-Star has grown to become
a worrisome competitor to Dialog in Europe and, to a lesser extent, in the 
United States. 

Richard R. Ream, Dialog's vice president, Worldwide Sales & Service,
estimated Data-Star's number of passwords to be 12,000-15,000, with fewer
than 5% held in the U.S. Dialog has 150,000 passwords worldwide, with
10,000-11,000 in Europe. Dialog's sales in 1992 were over $200 million and 
Data-Star's director, Rolando Henrich, estimated his company's sales to be 
$30-50 million. It might be worth noting that Knight-Ridder acquired Dialog 
for $353 million in 1988 when Dialog's sales were $100 million. 

Major plans by Dialog include moving its European center of 
operations to Bern and eventually creating a common platform for both 
services. Mr. Ream noted that the new platform would be "quite different 
from what exists now." For the present, Dialog will operate Data-Star 
independently, though combining service and sales areas. Customers will 
continue to contract for either service separately, though Dialog plans to 
develop a joint contract over time. Dialog will also eliminate areas of 
redundancy and introduce a two-way gateway between the two services. 

There will be no change for information providers in 1993; royalties 
will come as before from two sources, updates need to be sent to two places 
as before, and training and documentation will remain unchanged. 

Dialog says there will be no layoffs; however, some management
structures will change. Heinz Ochsner and Rolando Henrich will continue in
their current management roles at Data-Star, reporting to Martin Buerger,
who has taken the new role of Vice President, European Operations,
reporting to Pat Tierney.

Data-Star's London sales and marketing office, run by Peter Martin, 
will merge with Dialog's London office. Stuart Urwin, formerly Managing
Director, Dialog Europe, will head the consolidated office as Managing 
Director, European Sales and Service, reporting to Richard Ream. Peter 
Martin will aid in the transition and eventually assume new 
responsibilities at Dialog's headquarters. Look for an interview with the 
key executives of both these companies, including Pat Tierney and Martin 
Buerger, in an upcoming issue of ONLINE.

Online Access To Library Of Congress Automated Information Files Over
The Internet

Librarian of Congress James H. Billington announced that the Joint 
Committee on the Library has approved online access to the Library's 
automated information files through the Internet beginning in late April
1993. The files to be offered by the Library include all LC MARC files; 
copyright files, 1978 to the present; public policy citations, 1976 to the 
present; and federal bill status files. Both the technical 
processing/cataloging system (MUMS) and the reference/retrieval system 
(SCORPIO) will be accessible for searches over the Internet. 

The Library of Congress says it is able to offer remote access to its
public databases via the Internet as a free service, but must limit its
customer support to documentation download over the Internet. The Library
will begin by providing system availability to 60 simultaneous Internet
users to ensure that service to Congress and onsite users is not degraded.
Usage will be monitored to determine whether this number can be expanded if
needed, but service to congressional users will continue to be the
Library's primary goal for its online systems.

OCLC Announces Intent To Acquire Information Dimensions, Inc.

OCLC announced that it has signed a letter of intent to acquire
Information Dimensions, Inc. (IDI), a subsidiary of Battelle Memorial
Institute. The acquisition is dependent on the successful completion of a
ninety-day due diligence process. IDI develops and markets computer
software products for managing electronic documents and text on leading
mainframe computers, microcomputers, workstations, and PCs. IDI's two main
software products, BASISplus and ZyINDEX, help companies transform their
documents into information databases that provide systematic access to
large amounts of information.

OCLC president and CEO, K. Wayne Smith, said, "IDI would be very 
attractive as a standalone operation. But, we believe that IDI also 
provides an exciting strategic fit for OCLC in full-text electronic
publishing, electronic archiving, and information management--three areas
of growing importance in OCLC's future." 

IDI was founded in 1986 as a for-profit subsidiary of the 
not-for-profit Battelle Memorial Institute, a research and development
center in Columbus, Ohio. IDI has 280 employees worldwide, 150 of whom are
located in its headquarters in Dublin, Ohio. Sales in 1992 were $32.5
million, more than half of which were international. For more information, 
contact OCLC, 6565 Frantz Road, Dublin, OH 43017-3395; 614/764-6000; Fax 
614/764-6096. 

Government Printing Office Installs Galacticomm Bulletin Board

The Government Printing Office, the world's largest publisher, has
installed a Galacticomm bulletin board. The for-fee Galacticomm BBS 
operates on an 80486-based PC and reportedly has nearly a gigabyte of data 
available. All federal agencies are now free to put downloadable files on 
the GPO BBS for public access. 

To obtain more information about the GPO BBS, modem users can log on
to the system at 202-512-1387. The BBS call is free, except for phone
charges, but users must set up an account to actually download data. The
minimum download charge is $2, with a 1MB file costing about $20 to
download.

According to the opening screens of the BBS, user assistance is
available from 8 a.m. to 4 p.m., Eastern standard time, Monday through
Friday (except Federal holidays) by calling 202/512-1524. Depository 
Library staff should call 202/512-1126. The BBS is available 22 hours a 
day, 7 days a week (it is unavailable each day from 3 a.m. to 5 a.m., 
Eastern standard time, for maintenance). 

New Bookstore Accessible Online 

Book Stacks Unlimited, Inc. contains over 200,000 titles, searchable
by title, author, subject, or Dewey Decimal number. Customers pick the 
titles online, place an order, and within five to seven days, books are 
delivered to their home or office. There are currently 12 lines available 
for incoming calls. The main modem number is 216/861-0469 for 2400 baud 
(8/N/1) and 216/694-5732 for 9600 baud. For more information, contact Book
Stacks Unlimited, Inc., One Cleveland Center, 1375 East 9th Street, Suite
2260, Cleveland, OH 44114-1724; 216/861-0467.

SOFTWARE NEWS

Network Aware Version Of Pro-Cite From PBS 

Personal Bibliographic Software, Inc. (PBS) released Pro-Cite 2.1 for
the Macintosh, a network-aware version of its bibliographic database
management software. Version 2.1 allows multiple users to access the same 
database at the same time from a network server. A Pro-Cite 2.1 database 
can be opened as read-only by multiple users, but to ensure data integrity 
only one user is allowed in a database while it is being edited. The number 
of users allowed into a database depends upon the licensing agreement 
purchased. Pro-Cite 2.1 is available in economical multiuser packs of 5, 
10, 20, 35, 50, and 100, as well as a single-user version. 

There are no visual changes in the menus or dialogs from Pro-Cite
2.0x. Enhancements include full support of Microsoft Word 5.0 and use of
available memory to optimize time-consuming operations. All registered
users of Pro-Cite 2.0x will receive the single-user version of Pro-Cite 2.1
free of charge. Single licensed users of Pro-Cite 2.1 will be able to place
their database(s) on a server to share with colleagues who also have
licensed copies of Pro-Cite 2.1 at their workstations. Pro-Cite 2.1 for the
single user is $395. Prices for the multiuser version will vary according
to the number of users licensed to access the database. For more
information, contact PBS, P.O. Box 4250, Ann Arbor, MI 48106-4250;
313/996-1580; Fax 313/996-4672.

Enhancements And Upgrades Offered To BRS/SEARCH

BRS Software Products announced Release 6.1 of BRS/SEARCH and a
Microsoft Windows version of BRS/SEARCH. Release 6.1 offers enhancements
such as document analysis features, thesaurus browsing, WordPerfect
filters, and interoperability across hardware platforms (X Windows on IBM's
RS/6000 and under DEC's Ultrix).

In Release 6.1, a new tally command enables users to analyze selected
paragraphs for occurrences of a term. The user will see a list of the
terms used in those paragraphs, with occurrence and document counts
for each term. For example, using the tally command on the Course Names
fields, the user will see a list of all classes for which students are
registered and the number of students registered for each class. The new
relevance ranking feature allows the user to designate fields, such as
Title, Abstract, or Description, as having greater importance if a search
term occurs in them.
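
The kind of output a tally produces can be illustrated with a small sketch. This is not BRS/SEARCH syntax or its implementation; the Python function, the record layout (a dictionary per document with a list of values per field), and the sample data are all hypothetical.

```python
from collections import Counter


def tally(records, field):
    """Return {term: (occurrence_count, document_count)} for one field."""
    occurrences = Counter()
    documents = Counter()
    for record in records:
        values = record.get(field, [])
        occurrences.update(values)     # every occurrence counts
        documents.update(set(values))  # each record counts once per term
    return {term: (occurrences[term], documents[term]) for term in occurrences}


# Hypothetical student records with a "Course Names" field
students = [
    {"Course Names": ["Algebra", "History"]},
    {"Course Names": ["Algebra"]},
    {"Course Names": ["Chemistry", "Algebra"]},
]
```

Here tally(students, "Course Names") reports three students registered for Algebra and one each for History and Chemistry.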

BRS/SEARCH for Windows takes advantage of Windows features and
presentation style, including clipboard facilities and a toolbar for
common tasks. The Windows interface also provides the first client portion
of a planned client/server text-retrieval offering slated for delivery
during 1993. For more information, contact BRS Software Products, 8000 
Westpark Drive, McLean, VA 22102; 703/442-3870. 

At the ONLINE '87 conference in Anaheim, I gave a presentation on
digital networks. One piece of advice was to keep in contact with your
phone company, value-added network, or any group whose decisions regarding
telecommunications might affect you. In November 1987, I received an
invitation from Pacific Bell to look at numerous projects underway at 
Bellcore which will have an impact on library and information center 
services. 

Bellcore, or Bell Communications Research, was established to provide 
support for the seven regional Bell companies formed after the breakup of 
AT&T. Most of the 7,600 employees are located in New Jersey. They are 
involved in applied research, network planning, technology systems, and 
software technology. Much of their work is of interest only to telephone 
company engineers and network managers. However, the presentation held at
Pacific Bell was a rich mix of applications that will have a very powerful
impact on all of us. For more information, contact Bellcore at
290 W. Mt. Pleasant Avenue, Livingston, NJ 07039.

In the four hours I spent at presentations and booths, I gathered
enough ideas and material for several workshops and a couple of books. It
was a delightful example of information overload. Briefly, each presenter
had 25 minutes to show one or more new developments: new voice phone
technology; services to the deaf using speech synthesis (speech to
text--text to speech); high definition television; network management
software; mass market information services such as Teletel in France;
Integrated Services Digital Network developments including full motion
video over fiber optic phone lines; and information services within
Bellcore.

My column will concentrate on the innovations in information services 
being tested by Bellcore and Teletel. Three interrelated developments were
shown, and I had a chance to speak with some of the researchers and read 
the papers they had written on their projects. Much of the detail comes 
from these papers rather than my notes. 

THE BELLCORE ADVISOR 

The Bellcore Advisor is a system that allows the user to input a 
keyword to find parties within Bellcore who can answer a specific question. 
What sets this apart from systems in use at present is its ability to 
recognize plain English queries, to relate them to 7100 technical terms, 
and to assign a relevance index from -1 to +1 to each hit or citation (in
this case it would be a technical group within the organization). Any fit
index, as it is called, over .5 indicates you have a good chance of
finding what you want to know. The fit index is assigned by using a 100
dimension matrix, i.e., the program looks at your question from 100 
different angles or criteria and makes a judgment about the sum. I input 
the term "packet radio" and immediately the Sun workstation displayed a 
list of eight people or groups for me to contact. Each citation included 
voice phone and electronic mail address. The highest fit index was about .7 
which the speaker said was very good. Some other examples had a fit index 
of .95. You can see how valuable a tool this would be for businesses, 
technical support organizations, and large libraries or library 
cooperatives. Applying this to a community information and referral file 
might be too large an undertaking because the terms are so numerous. Still, 
it was an exciting tool that we could use every day. 
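
One common way to obtain a score between -1 and +1 from vectors of this kind is cosine similarity. The column does not say how the Advisor computes its fit index, so the Python sketch below is only an assumption for illustration; the function names are hypothetical and the toy vectors have two dimensions rather than 100.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two vectors; always between -1 and +1."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_groups(query_vec, group_vecs, threshold=0.5):
    """Return (group, fit) pairs above the threshold, best fit first."""
    scored = [(name, cosine(query_vec, vec)) for name, vec in group_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [(name, fit) for name, fit in scored if fit > threshold]
```

Under this assumption, a group whose vector points in the same direction as the query scores 1.0, an unrelated group scores 0, and only groups scoring above .5 would be listed.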



SUPERBOOK - A TEXT BROWSING TOOL 

Thomas Landauer presented SuperBook, a text browsing tool running on 
Sun workstations. It could run on an optical disk, but a hard disk was used 
for the demonstration. Landauer knew that delivery of documents in 
electronic form was fast and efficient, but that using them in that form 
was not very attractive because of small video screens and the slow paging
or scrolling through on-screen text. He and his team also wanted to improve
the way people obtained information from reference materials. As most of us
know, many failures are caused by choosing incorrect search terms.
Increasing the number of access points for an object or a term can raise 
search success rates by a factor of four. 

SuperBook also builds a full-text index to accompany a dynamic table
of contents that shows varying levels of detail. This table can be expanded
or reduced according to the scope of the user. The number of times a
word occurs is shown next to the title of each part of the table of
contents. Just to the right of the column of text is a blank space for
annotations by each reader. There is no limit on the length of the 
annotation, which is marked by the writer's initials and the date. 

SuperBook is being used with a Bellcore technical document that 
contains about four megabytes of text divided up into sections and 
sub-sections, seven levels deep. SuperBook shows four windows: the title, 
table of contents, a page of text, and a word lookup. The last window keeps 
track of all the search terms. The word "queuing" was a search term that
was found throughout the document. When it was entered, the table of
contents showed the number of times the root "queu-" occurs in each section.
Another feature is called "adaptive indexing;" we might call it cross
referencing. The words "phone company" are not used in this document, but
"TelCo" is. The user can link one term with the other, so that anyone can use
the new term. Each user can enrich the searching capabilities of the
system. For further enrichment, users can make margin notes with their
login id and date affixed. As with other hypertext systems, SuperBook
helps the reader overcome the limitations of linear, printed text. Some
users suffer from a temporary disorientation; they can't be sure they are on
page 130 in a 350 page book. At present this works in a UNIX environment
with a 19" monitor. Because they intend for the system to handle existing
text documents, considerable effort has been expended to allow SuperBook to
accept many different word processing formats. SuperBook's preprocessor
reads and analyzes online text, builds an index and formats it for the 
browser. A fifth window, for graphics, has not been implemented yet, but it 
is included in the diagram of a Sun workstation screen. 
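
The per-section hit counts described above (how often the root "queu-"
occurs in each part of the table of contents) can be illustrated with a
minimal sketch. This is not SuperBook's code; the function name, the section
titles, and the sample text are all hypothetical, and a simple "word starts
with the root" rule stands in for whatever stemming the real system used.

```python
import re

def count_stem_by_section(sections, stem):
    """Map each section title to the number of words that begin with `stem`."""
    # Assumption: a hit is any word starting with the stem, ignoring case.
    pattern = re.compile(r"\b" + re.escape(stem) + r"\w*", re.IGNORECASE)
    return {title: len(pattern.findall(text)) for title, text in sections}

# Hypothetical document: (section title, section text) pairs.
sections = [
    ("1 Traffic theory", "Queuing delay grows as queues fill."),
    ("2 Billing", "No waiting lines are discussed here."),
    ("3 Switching", "A queue forms; queued calls wait."),
]

# Print a table of contents with the hit count beside each title.
for title, hits in count_stem_by_section(sections, "queu").items():
    print(f"{hits:3d}  {title}")
```

Showing zero-hit sections as well keeps the table of contents intact, so
the reader still sees where the hits fall within the whole document.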

TELESOPHY - A NETWORK INFORMATION RETRIEVAL SYSTEM 

The third demonstration was a network information retrieval system 
developed by Bruce Schatz and Stephen Bulick. Telesophy, as it is called, 
is a system that is based on the assumption that the end-user should be
directly involved because he or she is the best judge of relevance and 
because the number of professional searchers is so small compared with the 
broad market of information consumers. Telesophy, or knowledge at a 
distance, assumes that equipment will have cheap and readily available 
bandwidth, graphics and text in digital form, and that the terminals will 
be equivalent to engineering workstations of today with a two to four MIPS 
processor, graphics interface, and more than two megabytes of memory. The 
Bellcore example, which is being used by a small group in the company, runs 
on Sun workstations running on a 10 MB/second Ethernet. Some of the Suns 
are used as "Index Servers" using inverted file indices of keywords as well
as two other types. They are still experimenting with other ways of 
accessing the information which includes electronic mail, wire service news 
feeds, the online catalog from the Bellcore library, a dictionary, a movie 
database, Usenet and ARPAnet groups, full text of popular magazines and 
several years of the INSPEC computer and control database. Although it is 
being used only in a laboratory setting, the technology to implement it is 
here, at a price. I have strong doubts that the market will demand this 
sort of service before 1990. However, I worry more that the regulatory 
apparatus will hold back significant developments in the telecommunications 
infrastructure that would make use of the full powers of these systems. 

At the International Online Meeting in London in December 1992, Dialog
Information Services previewed a new RANK command. The new command is an
analysis tool that reveals statistical trends in search results: it
counts the occurrences of unique terms within a specific field or 
fields from an established search set, thus allowing searchers to 
pinpoint essential information. 
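
As a rough illustration of this kind of analysis (not DIALOG's
implementation), the sketch below tallies how many records in a result set
carry each term in a chosen field and lists the terms from most to least
frequent. The record layout, the field name "descriptor", and the sample
terms are assumptions made for the example.

```python
from collections import Counter

def rank_field(records, field):
    """Return (term, record count) pairs for one field, most frequent first."""
    counts = Counter()
    for record in records:
        # Assumption: each record counts once per term, even if a term repeats.
        counts.update(sorted(set(record.get(field, []))))
    return counts.most_common()

# Hypothetical search set of three records.
search_set = [
    {"descriptor": ["online searching", "indexing"]},
    {"descriptor": ["indexing"]},
    {"descriptor": ["indexing", "ranking"]},
]

for term, n in rank_field(search_set, "descriptor"):
    print(f"{n:5d}  {term}")
```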

The RANK command is available in most DIALOG files and has been 
designed to work in most phrase-indexed additional index fields, most 
numeric additional index fields, and with phrase-indexed descriptor and 
identifier basic index fields. 

Up to 50,000 records of search results can be analyzed. During the
first three months of availability the command will be offered free of
charge. Thereafter it will cost $0.02 per ranked record. More information
is available from: Dialog Information Services, 3460 Hillview, Palo Alto,
CA 94304. Telephone: (415) 858-3785. Fax: (415) 858-7069. 

... restrict a list of synonyms, but it is a shorthand for a Boolean
expression.

Weighted searches assign a score that attempts to measure how well 
a document fits the query. This... 

...word scoring schemes. The simplest method is a tally of the presence or 
absence of query words from a document. No special weight is assigned 
to one term over another. The second method is to report the number 
of occurrences of each word or pattern. It is assumed that the 
documents with the most hits are the best ones. A trick used on the 
Internet to raise the score of a web site in search engines with this
approach is to fill a comment field with repetitions of a few key words

In a mixed strategy, each word or pattern gets a weight, which is
multiplied by the number of occurrences, to give a score. This strategy
is a little harder to implement, because you must... 

...method is really a special case of this, with a weight of one for each 
search term. 
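
The mixed strategy (weight multiplied by number of occurrences) can be
sketched as follows. The tokenizer, the function name, and the sample weights
are assumptions for illustration; setting every weight to one reduces it to a
plain occurrence count.

```python
import re
from collections import Counter

def score(document, weights):
    """Sum of weight * occurrence count over the query terms."""
    # Assumption: terms are single lower-cased alphanumeric words.
    counts = Counter(re.findall(r"[a-z0-9]+", document.lower()))
    return sum(weight * counts[term] for term, weight in weights.items())

doc = "Hopi pottery and Hopi weaving"
print(score(doc, {"hopi": 2.0, "pottery": 1.0}))  # weighted score
print(score(doc, {"hopi": 1, "pottery": 1}))      # unweighted occurrence count
```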

The fourth method is to have a semantic tree structure that assigns 
heavier weight. . . 

...This means that the thesaurus must know the difference between broader 
and narrower terms. A search for "Southwest-Indian?" would give more 
points to documents containing names of particular tribes ("Hopi . . . 

Titles allows inspection of an alphabetical list of all article
titles, and subsequent selection.

Word Search allows entering of single words or phrases with full
Boolean capability; words may be searched...

...g., kidney, renal). Internal and right-hand truncation is permitted.
After searching for the specified words, CONSULT retrieves a list of
titles in which the search criteria were met and then ranks the
retrieval according to the number of occurrences of the searched
words.

Figure 7 was compiled from a word search on "seborrheic
dermatitis." Having chosen an article from the list, the user will be
placed at the first occurrence of the searched terms within that article.
CONSULT becomes even more interesting! While in a full-text monograph

...be displayed. (And yes, there is a photograph for "seborrheic
dermatitis.") One public user began searching CONSULT and was retrieving
photos of black widow and brown recluse spiders in seconds! As with any
good medical...

Boolean operations in limiting or expanding the number of articles
found. Once satisfied with the search, you can display a list of the
titles found.

The Browse menu item works differently...word indexed in the database.
A text entry line appears at the top of the list. As you enter a word
on this line, the list narrows down to the word or words that match
the letters typed thus far. Each word on the main list includes the
number of times it appears within the three years of NEJM text. When you
find the word you want, press Return or F10 to see a list of titles
that contain that word.
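
The narrowing behavior described for Browse can be sketched with a sorted
word list and binary search. This is an illustrative assumption about how
such a list could work, not the product's code; the words and counts below
are invented.

```python
from bisect import bisect_left

def narrow(index, prefix):
    """Return the (word, count) entries whose word starts with `prefix`.

    `index` must be sorted by word.
    """
    words = [word for word, _ in index]
    lo = bisect_left(words, prefix)
    # "\uffff" sorts after any ordinary character, so it marks the end
    # of the run of words sharing the prefix.
    hi = bisect_left(words, prefix + "\uffff")
    return index[lo:hi]

# Hypothetical browse list: each word with its occurrence count.
browse_list = [("renal", 412), ("renin", 57), ("rental", 3), ("retina", 88)]

print(narrow(browse_list, "ren"))   # three candidates remain
print(narrow(browse_list, "reni"))  # one more letter narrows to a single word
```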

You cannot save Search and Browse strategies. DiscPassage does not 
support search operations that use such operators as NEAR, WITH, WITHIN
(X) WORDS OF. Also, you cannot search illustration captions and table
text. It has a sparse but usable help system. 

The Contents. . . 

can assign password access privileges to documents or entire
libraries.

The browser module lets users search across libraries of documents
in the database and retrieve individual pages or entire documents to
their PC workstations. It also provides bookmarking and annotation
features. Users can perform simple or complex Boolean searches; the
software supports wildcards and multiple Boolean operators for refining
search criteria. During a search, a scrollable keyword list can be
displayed to show all matching keywords found and the number of
occurrences.

The bookmarking feature lets users annotate pages with comments of up 
to 20 characters in length; bookmarks can also be used to construct 
cross-reference hyperlinks between pages or documents. Once defined, 
bookmarks can be accessed quickly and directly. Additional page... 

...text format conversion steps are necessary. In addition, the full
indexing capability provides a definitive search for rapid retrieval of
information.

It appears the software will be most useful in document...

... to be used in applications such as reference manuals or price
lists, where documents are searched and updated on a regular basis to a 
distributed audience. 

Builder and Reader modules 

There. . . 

...indirect method uses artificial intelligence techniques to create the 
indexes required. It analyses text in terms of word distribution and 
the number of times words appear in a document. This information is 
then used to build the cross-references. 

With either method, the program creates outline documents, indexes 
and automatic links. The outline document is similar to a table of
contents, and the automatic links depend on the structure of the source 
document. If. . . 

same but is case-insensitive.

* ATLINE() also has a case-insensitive version, ATCLINE(), to find
the line number of a string. This is very useful in memo fields.

* RAT() and RATLINE() determine the start of a string in reverse,
beginning at the end.

* BETWEEN() determines if an expression falls between two other
expressions, whether character, numeric or date.

* OCCURS() determines the number of occurrences of one string in
another.

* INLIST() is a similar, redundant function.

* CHRTRAN() translates characters of a string using a translate
table.

* STRTRAN() searches for a string and replaces it, just like a
word processor search and replace routine.

* MIN() and MAX() work on any kind of data.

* SECONDS() returns the system...
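
For readers unfamiliar with an occurrence-counting string function such as
the OCCURS() entry in the list above, here is a rough Python analogue. It is
only a sketch: the argument order and the non-overlapping counting rule are
assumptions, and it does not claim to reproduce the reviewed product's exact
behavior.

```python
def occurs(needle, haystack):
    """Count non-overlapping occurrences of `needle` inside `haystack`."""
    if not needle:
        # Guard: str.count("") would return len(haystack) + 1.
        return 0
    return haystack.count(needle)

print(occurs("a", "banana"))    # single-character search
print(occurs("ana", "banana"))  # overlapping matches are counted once
```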

that are contained in records that match your criteria. For
example, if you want to find out the total amount by which your accounts
receivable is in arrears, specify @SELSUM to...

...OCCUR versus 1-2-3's or Symphony's @DCOUNT emerges when you want to
count occurrences in a table that you haven't set up as a 1-2-3 or
Symphony database. @DCOUNT requires you to add field names to the top
of the table and to create a Criterion range. @OCCUR lets you skip those
steps. @OCCUR also lets you count occurrences in more than one column.
@DCOUNT does not.

@SELSUM and @OCCUR share one drawback. When specifying criteria for
summing or...

with the quick look-up index requires selecting the field or fields
on which to search and entering the search criteria. The program
immediately reports the number of matches and highlights the text string.
At...

...display the other matches. Because the quick look-up index covers the 
entire database, the search may take some time if your database is large. 

If you need a faster search and know you do not need to look 
through the entire database, you can use a list search. Marcon Plus
provides three types of indexed lists: add-on, unique, and preset. The
add-on list consists of an index and an occurrence counter; that is,
it tells you how many records contain the word in the list. A unique
list is an index of a field that contains a unique value such as a
document number. A preset list is an...

...the word does not exist in the list, you cannot enter it into the field.

Searches can be simple one-word queries, or they can include one
of the four Boolean operators. You can also search across ranges and
within a specified proximity of a given word. The proximity search can be
fairly exact. For example, you can search for "chips pre/5 ROM" for
documents that contain both words within five words of...
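
A proximity test of this general kind can be sketched as below. The excerpt
does not define the operator's exact rules, so this is an assumption-laden
illustration: the function name is hypothetical, distance is measured in
word positions, and the optional `ordered` flag models a "first term must
precede the second" reading.

```python
import re

def within(text, first, second, max_gap, ordered=False):
    """True if `first` and `second` occur within `max_gap` words of each other."""
    words = re.findall(r"\w+", text.lower())
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    for a in first_positions:
        for b in second_positions:
            distance = b - a
            if ordered and distance <= 0:
                continue  # second term must come after the first
            if 0 < abs(distance) <= max_gap:
                return True
    return False

sample = "The chips include a boot ROM"
print(within(sample, "chips", "ROM", 5))
print(within(sample, "chips", "ROM", 3))
```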



24/3,K/9 (Item 1 from file: 636)

DIALOG(R)File 636:Gale Group Newsletter DB(TM)
(c) 2006 The Gale Group. All rts. reserv.

02882699 Supplier Number: 45851485 (USE FORMAT 7 FOR FULLTEXT)
At Presstime: Bell Atlantic Launches Internet Directory Trial

Yellow Pages & Directory Report, v11, n17, pN/A
Oct 11, 1995 

Language: English Record Type: Full text 
Document Type: Newsletter; Trade 
Word Count: 227 

to Advertise feature allows first-time advertisers to join the
service. Existing advertisers can upgrade their listings by adding
hyperlinks to their home pages, additional category listings, more
detailed information, or links to coupons and sales information.

Users can search by category and city, and narrow their searches
by company name, brands, and products and services. The service also
offers government listings and community event information.

Bell Atlantic plans to track the number of times a user accesses a
listing and collect feedback through user questionnaires.

The Interactive Yellow Pages is linked to Bell Atlantic...

Retriever permits editing and annotation of collected data -- a
capability not available with competitors' products.
Retrieving the goods
I initiated a Web Retriever session by specifying several Internet and
intranet sites...

...HTML files, requiring you to individually open files to do a search.

Web Retriever's search function is excellent. The software's full-word
indexing yielded very fast responses to complex queries involving
multiple words, such as Dole, Clinton, or Perot. The Results Map part of
the query dialog box summarized the number of times each word or
phrase appeared in the database.

Afterward, I switched to the Table of Contents view in order to see 
how many hits were in each heading within the database and then jumped to 
specific passages where the requested words were clearly highlighted. 

Making a link 

Web Retriever maintains internal hyperlinks in the converted
database, so maneuvering through the downloaded information was especially
easy.

When a...

...inserted Margin Notes, which are pop-up windows containing comments or
other notations.

Furthermore, Web Retriever lets you insert a special Query Link
within a database. As the name indicates, clicking on this link performs a
preassigned...

terms and found twenty-six mentions of terms such as upset, angry,
and uptight. This finding is reported in table 4 and corresponds to the
frequency score of 26 in the "Emotionality" row of the "Relationship
conflict" section in the Domestic Coding unit frequency column.

(TABULAR DATA 4 NOT REPRODUCIBLE IN ASCII)
Contextual ratings. The number of times a term is mentioned by
an informant or a group is identified by frequency counts, but the meaning 
surrounding the term (e.g., a high or low level of relationship conflict) 
is not. Therefore, three research... 

users and repeatedly send them directly to the intranet.
Administrator interface features also include usage reports, allowing
managers to track the number of visits to specific areas of the intranet
and the number of times articles have been viewed.

Articles from Dow Jones are drawn from thousands of publications in
Dow Jones Interactive to match topic profiles customers have requested.
Content managers can view an index of folders and articles... 

...e-mail articles, add commentary, mark articles "hot," and delete 
articles as necessary. To ease searching on their intranets, they can 
also apply Dow Jones and internal coding to articles. 
Building. . . 

Library Software Review, v17, n2, p90(49)
June, 1998

... Back button to come back to this section.)

If you wish to use a previous search in the body of a new search,
select the search from the Search History box, and click the Retype
button. It is immediately copied to the Search box.

If you wish to clear one or more searches from the Search History
box, click the Clear button, and select the appropriate option.

The following two sections on the PsycLIT Web page are the same as
those shown in the ERIC sections of this article:

Viewing Your Results
Marking, Printing, and Downloading Records

Using the Index 

The index provides an alphabetical list of all terms in the 
database. It lets you view the number of articles (records) in which your 
search term appears as well as the number of times it is found in 
the database. You can search for and retrieve results on the same 
screen. 

Try one of the following methods. 
Enter a term (or. . . 

...sure to place a hyphen between words in a phrase. 

Click the Lookup button to search for relevant articles that display 
below. 

When existing index terms display below, scroll to view...

Bibliography formatting software: an updated buying guide for 1994.

Stigleman, Sue

1: changes include increases in record and field sizes; character
formatting; duplicate checking; more powerful searching; more flexible
output options; greater use of windows; importing module moved into program
(although still an optional purchase); more flexible importing; mouse
support; global search and replace.
Reference Management System 

Upgraded to 3.2b: more online help; more customization; new duplicate 
checking; extended search options; new short-form browsing display; 
search and replace. 

Reference Manager 

Acquired by Thomson Corporation, which also owns ISI. Released
Windows version...

...version 6.0 for MS-DOS and Windows released: many more publication
types; user-definable fields; save search strategies; global addition
of key words to retrieval sets; expanded reference ID number; ability
to use author and year as reference ID; lists of keywords, authors, and
journals have number of occurrences; can use first names for authors
and editors; choice of "exacting" or "forgiving" duplicate detection.
RefMenu 

Version 4.1: greatly expanded capability for Boolean searching of 
the notes; full reference shown with matching notes. 

REFS 

No change to main program; is more "network-aware" and supports a 
number. . . 

HOW TO EXPAND

The format of the EXPAND command is usually the same as the SEARCH
command. Depending on the system, you can specify in which fields the terms
should be...

...a display of terms nearby in the index.

The usual result, as seen in the Table, is a list of terms
numbered on one side with line numbers that can be used later to create
sets or for further EXPANDing if a thesaurus function is available. Another
column indicates the number of times the term appears in the
database.

TABLE 1 

EXPAND Commands on Major Online Systems



SYSTEM 
BRS 

DATA-STAR

DATATIMES
DIALOG

INFOGLOBE

LEXIS 

NEWSNET 



COMMAND 

root term 
. .root term 
. . list term 

under development 



SELECT CONTINUE 
R# 

R# * 
R# * 



THESAURUS 



.THES term 



EXPAND term S E# 

expand field =term 

DICT term 

Does not exist 

Does not exist 



EXPAND E# 



NLM - ELHILL NBR. . . 
. . .term(field) 

WESTLAW Does not exist

You can then put the line numbers into a search statement and they 
will be treated as the term itself; you do not have to retype the string. 
This is an easy way to save time. Retyping some search strings such as 
for cited references can be frustrating. Remember that most systems allow 
you . . . 

The Serials Directory/EBSCO CD-ROM. (CD-ROM Review)

executed in the CD version.

Wild card and right-hand truncation may be used in most search
fields, including Search Limiters. Searching for phrases containing
punctuation or stopwords is remarkably easy, unlike...

...bank for america or bank of the america). The program returns any
variation of hyphenated terms (for example, post-adolescent, post
adolescent, postadolescent). This forgiving feature greatly reduces the
number of times a query returns a puzzling and frustrating "No Hits."

Authority Files Searching

The Authority File search method allows the user direct access to
the lists of valid entries for the Subject, Title, Publisher, and Index
and Abstract indexes. This second search option is useful, especially
when the searcher is looking for specific or known information. Since one
is directly accessing the index files (indexes here in the database sense),
the system executes searches much more quickly in the Authority Files than
in the Query Profile.

At user levels 3 and above, searches from either the Query 
Profile or the Authority Files may be saved, then recalled and re-executed 
at a. . . 

. . .Results 

A brief version of the result list is automatically displayed upon 
completion of a search. The user may browse the list in several ways:
using the arrow, page up/down...

question mark (?) for a single character and asterisk (*) for a
characters. Reset and Search buttons are available from the 
bottom of the template. 

When you start a search, the... 

...display and you can either cancel the search or choose OK to bring up
the title list.

The title list window displays the total number of articles and
occurrences and then each article preceded by the number of
occurrences within the article. You can sort the title list
alphabetically or by occurrences. When you open an article the search
terms appear in bold text; "remote control" arrows appear at the top left 
of the screen for jumping to the next, previous, beginning, or end 
occurrence. 

The Search menu includes a choice for setting search options. A 
limiting feature allows you to select one or more alternatives among 
titles, text, bibliographies, fact boxes, and picture captions. A proximity 
box sets phrase searching to the same article, the same paragraph, or 
within a specified number of words. An... 

...to specify whether the word must be found in the same order specified by
the search.

The Timeline and Knowledge Tree provide additional search strategies. 
The Timeline includes over 5,000... 

ZyIndex for Windows. Version 5.0. (Software Review) (one of five

Marshall, Patrick; Watt, Peggy

term must come first. The program can do fancy tricks such as range
and quorum searches.

Finally, one of ZyIndex's most powerful capabilities is its field
searching. You can define...

...difficulty on our progressive search task.

With its clearly laid-out buttons to access previous searches,
fields, and concepts, ZyIndex's Search Request screen is an exemplar of
efficiency. You will also find buttons that pop up a thesaurus for
search terms and on-line help to construct search arguments.
Vocabulary shows a list of the entire contents of the index along with
the number of times each word appears.

The program offers flexibility for getting the actual data once you
have retrieved a set of files. You can set the Search Results display
in either of two modes: a simple listing of retrieved files that shows
the number of hits in each (along with the file's path...

...in context) view that displays each hit in the context of its 
surrounding text. 

The Search Results screen also offers a row of buttons that will 
return you to the search screen, show the highlighted file, print files, 
or launch them into their parent application. Once... 

user to limit a search word to a specific field. The F4 key (Show)
displays search results on the screen. The arrow keys now allow scrolling 
line by line within a... 

...display and print. 

Information that is database specific, such as database terminology,
stopwords, or a list of limit fields, can be found through the F3 key
(Guide). A list of all searchable terms, except the limit terms, is
available through the F5 key (Index). All index entries are listed with
the number of times they occur in the database and the number of
records in which they occur.
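
The two figures shown beside each index entry (total occurrences and
number of records containing the term) are different counts, as the
following minimal sketch shows. It is not SilverPlatter's code; the
tokenizer, function name, and sample records are assumptions for
illustration.

```python
import re
from collections import Counter

def build_index(records):
    """Map each term to (total occurrences, number of records containing it)."""
    occurrences = Counter()
    record_counts = Counter()
    for text in records:
        words = re.findall(r"[a-z0-9]+", text.lower())
        occurrences.update(words)         # every appearance counts
        record_counts.update(set(words))  # each record counts once per term
    return {w: (occurrences[w], record_counts[w]) for w in sorted(occurrences)}

# Hypothetical two-record database.
database = ["Laser disc and laser printer", "Disc drive"]

for term, (hits, recs) in build_index(database).items():
    print(f"{term:10s} occurrences={hits} records={recs}")
```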

Discs. . . 

...F8 key (Xchange). This is a helpful feature because it has been modified
to allow search strategies to be rerun on other SilverPlatter databases. 

EQUIPMENT 

SPIRS2 runs on IBM PCs or... 

Answers on the disc: general encyclopedias on CD-ROM. (Reviews & Product

Hollens, Deborah; Rible, Jim

CD-ROM Professional, v4, n4, p54(7)

title index at the alphabetical point closest to what you have
typed. You can then retrieve the full text of the article listed.
The screen always indicates what page of the... 

...easily through an article with an outline by simply using the cursor to
indicate what section you wish to see. Bibliographies, if available, are 
indicated in the outline. 

Browse Word Index takes you into an alphabetical list of the
136,750 unique words in the encyclopedia. The number of articles in which
the word appears is listed along with the number of times the word
is used in those articles. It is useful for checking the proper spelling
of a word or to see how many times a particular word is listed in the
work. Before using truncation, one can easily determine if the word root
selected...

...Search resembles an online version of the encyclopedia in that the user 
can execute Boolean searching on combinations of keywords in the full 
text. This is the most sophisticated portion of the program and the section 
with the most potential for the researcher. At the Word Search menu, you
can specify a combination of search terms to be used. To enter a search
such as "(chemical or biological) and (weapons or warfare)" you type the
phrases chemical, biological...

index that you can EXPAND on would make the process of personal and
corporate name searching more predictable and accurate.

* ED= Something needs to be done to catch all the variant...

...edition and all stories from all editions online; then add to each
record an edition field, which can be a search key (ED=) and a limiter
(/x edition).

* Rank Terms We can rank files in DIALINDEX, showing which
contain the most occurrences of our search terms. Now let us rank the
records retrieved in full-text article searching by the number of
times our search terms are mentioned in them, thereby giving us a
prioritized list of articles to scan.

* The KWIC window The default KWIC
limitation of 30 words to surround a search term doesn't always
provide sufficient information when scanning newspaper articles for
relevance. Expanding the window...

Bureau of Electronic Publishing makes history. (U.S. history on CD-ROM)
(evaluation)

Desmarais, Norman
CD-ROM Librarian, v6, n2, p24(5)
Feb, 1991

DOCUMENT TYPE: evaluation ISSN: 0893-9934 LANGUAGE: ENGLISH
RECORD TYPE: FULLTEXT
WORD COUNT: 3181 LINE COUNT: 00247

menu to select any images or tables associated with the document;
and F10 (Titles or Search) locates titles from the search or words
used within documents. Pressing Shift-F10 (or Shift-Left Arrow) will
locate previous matches within the article.
Browse 

Browse mode provides an alphabetical list of words linked... 

...database (see Figure 4). You can use the cursor control keys to scroll
through the list. Alternatively, typing the first letters of the desired
word on the Find: line moves successively closer to the desired term.
Each word lists the number of occurrences. You can then select and
view documents as described under Search.
Contents 

The Contents option lets the user peruse the categories and books 
that comprise U.S. History on CD-ROM. In essence, this feature allows... 

...Effect on Society" and "The World Wars: 1914-1945" subcategory of "Wars
and Conflicts."

Upon retrieving a document, the user can display it for viewing 
(see Figure 5). DiscPassage highlights the... 

Focus on: Global change. (database of bibliographic citations to
environmental matters) (evaluation)

Weinschenk, Andrea
RQ, v30, n1, p101(2)
Fall, 1990

CODEN: RQRQAQ DOCUMENT TYPE: evaluation ISSN: 0033-7072

LANGUAGE: ENGLISH RECORD TYPE: FULLTEXT 

WORD COUNT: 1265 LINE COUNT: 00099 

on a special form provided by ISI. 
A Search mode is also available for Boolean searching. The Search
screen is in menu form. Terms can be combined with Boolean operators in...

...select these fields in order to search the different parts of the
database. Like citations retrieved in Browse mode, Search mode
citations may be marked for inclusion in the PIC, GA, and RAP lists.

There is a Dictionary function for the various fields. By typing a
word, a word stem, or a letter, and typing the dictionary command, users
are put into a list of words that shows the number of occurrences
in the issue of Focus On: Global Change currently open. Terms may be
selected from the list and posted in a search statement for searching.

As with other ISI databases, subject access is by words in the title.
No enhancement of titles exists. Theoretically, this should not be a
problem with scientific writing...

...of disciplines does not appear in the manual.

Printing article citations from the Browse or Search modes is
straightforward. Pressing "p" for print begins the process. Printing lists
of articles from...

The International Encyclopedia of Education, CD-ROM. (evaluation)

Urrows, Henry; Urrows, Elizabeth
CD-ROM Librarian, v5, n4, p22(6) 
April, 1990 

DOCUMENT TYPE: evaluation ISSN: 0893-9934 LANGUAGE: ENGLISH 

RECORD TYPE: FULLTEXT 

WORD COUNT: 3533 LINE COUNT: 00296 

graphic chart. Right now Graphic KRS exists for computers that use
the GEM/3 (Graphic Retrieval software from Digital Research, Inc.)
environment.

We were conditioned to expect full-text word searching that could
find a word in the directory and find its total number of
occurrences quickly by searching through millions of words. We should
do topic searches, entering a term such as optical media and soon get a
readout listing titles beginning with the closest alphabetical match.

Multiple word search ought to come through "operators" spelling out 
word relationship options: adjacency, like internal revenue; such... 

...can enable a user to negate King as a secondary term if we want to find 
all references to Martin Luther without also collecting Martin Luther 
King. Truncation could permit us... 

CompuServe's SIGs: on the frontier of civilized searching. (includes
related information)

Glossbrenner, Alfred
Database, v12, n5, p50(8)
Oct, 1989

ISSN: 0162-4105 LANGUAGE: ENGLISH RECORD TYPE: FULLTEXT 

WORD COUNT: 4715 LINE COUNT: 00355 

every keyword used in a given library, sign onto the SIG, enter the 
data libraries section and choose a library of interest. Then open your 
software capture buffer to record incoming information. At the library menu 
prompt, type in KEY and hit <Enter>. This
will generate a list of all the keywords in the library, preceded by a
number indicating how many times each has been used within that library
(Figure 7). When the list is complete, close your capture buffer and type
in OFF to sign off the system. 
FIGURE 7 

THE RIFLE-SHOT KEYWORD APPROACH 

The best way to stack the deck in your favor when searching a SIG 
library is to start by calling up a list of every keyword subscribers... 
...attached.

With the keyword list in hand, you can't miss whether you conduct a
search or opt to browse through the library. The list shown below has been
shortened to...

ZyINDEX: bringing order to electronic chaos. (evaluation)

Powell, Antoinette Paris

Library Software Review, v8, n3, p155(4)

May-June, 1989

DOCUMENT TYPE: evaluation ISSN: 0742-5759 LANGUAGE: ENGLISH 

RECORD TYPE: FULLTEXT 

WORD COUNT: 3286 LINE COUNT: 00241 

list was and I found it in a matter of seconds with ZyINDEX.
ZyINDEX handled searching of MARC records well and I was successful at
retrieving records by author, title, ISBN, OCLC numbers, and a multiple of
other access points. The...

...program from ZyLABS that provides additional features to ZyINDEX.
ZyFEATURES includes the capability of doing field-specific searching
and creating macros for repetitive searches. It has an "on the fly"
search that allows the user to do proximity searching without limiting
the number of terms between. In addition there is a thesaurus that lists
a maximum of 15 synonyms for a term and the number of occurrences of
each term in the index list. The thesaurus allows users to add their
own core words and synonyms to it.

ZyFEATURES can be installed at any time and installation will
encompass...

Beyond Medline: a review of ten non-Medline CD-ROM databases for the health
sciences.

Fryer, Regina Kenny; Helenius, Majlen
Laserdisk Professional, v2, n3, p27(11)
May, 1989 

ISSN: 0896-4149 LANGUAGE: ENGLISH RECORD TYPE: FULLTEXT 

WORD COUNT: 5677 LINE COUNT: 00477 

current year disk. Back years are available in two year segments at 
$1000 per disk. 

Search Software 

The Life Sciences Collection offers both a menu-driven and dot level 
command mode...

...fields, but no specific field selection is equivalent to a global search
of all the fields.

One can browse a dictionary file, but cannot select terms from the
list and must therefore reenter each term. Phrase searching and left
and right truncation are available. Earlier search statements can be reused
and combined with new terms. The program displays the number of
occurrences of a term and the number of documents retrieved. The system
holds ten search statements per session. Displayed citations include the
field abbreviations and the field name. This gives a cluttered look
and the double field presentation does not enhance the display. Both
display and printing can be customized; the default displays all fields.
Citations can be easily downloaded to a disk. A window showing the 
execution of downloading... 

...for reuse by means of a macro. 

The dot level mode allows the user to search by a series of 
commands. S precedes all searching, E is the expand command to view
dictionary terms, D is the display command, and P is the print command. To
Dialog searchers these commands will have a familiar feel. The dot
command system makes available all features...



24/3,K/29 (Item 19 from file: 148) 

DIALOG (R) File 148:Gale Group Trade & Industry DB 
(c)2006 The Gale Group. All rts. reserv. 

03715504 SUPPLIER NUMBER: 06854262 (USE FORMAT 7 OR 9 FOR FULL TEXT) 

Publish or Perish: bibliographic data management. 

Thomas, Lynn L. 

Information Today, v5, n10, p13(2)
Nov, 1988 

DOCUMENT TYPE: evaluation ISSN: 8755-6286 LANGUAGE: ENGLISH 

RECORD TYPE: FULLTEXT 

WORD COUNT: 1756 LINE COUNT: 00131 

... the twelve character limit for each keyword too constraining for 
convenient use, but others may find the discipline it imposes a useful 
way to order the work. More than one Note... 

...bibliographic entry. The numbered Note Card lines allow users to easily 
allocate space for separate categories of information. 

The Publish or Perish main menu allows users to display the list of 
keywords alphabetically, so they can browse that list or use it to fine 
tune their spelling before they do their search. The keyword screen
gives the first nine letters of keywords, and the number of times
each keyword appears. A patient user could become adept enough with a
given set of information to use the search routines responsively. If
users have more keywords in the file than show on one screen (60) they
are prompted to tap the...

...limits. If users wish to examine all their references, they can enter
nothing for a search, and then just enter FI at each entry to move on to
the next entry...

Software choices for in-house databases. (includes related article)

Tenopir, Carol; Lundeen, Gerald W.; Hane, Paula J.
Database, v11, n3, p34(9)
June, 1988 

ISSN: 0162-4105 LANGUAGE: ENGLISH RECORD TYPE: FULLTEXT 

WORD COUNT: 5443 LINE COUNT: 00448 

the indexes varies, but is often over 100%. 
IS&R packages offer a variety of search features, typically 
including: Boolean logic with nesting, truncation, proximity searching,
set building, range searching. They frequently offer a choice of output
formats, but report writing capabilities vary. Because these...

...strengths and weaknesses. INMAGIC offers power and flexibility in the
database design process and in report generation. Fields may be
designated as word indexed, phrase indexed, or both. This adds power to
controlled vocabulary fields by allowing subject headings to be searched
as complete bound phrases or as individual words within the subject
headings. Field and file size are unlimited, and fields may be repeated
any number of times in a file.

INMAGIC allows nested Boolean operations, truncation, and set
building, plus it can...

...called 'BiblioGuide: using INMAGIC in Libraries.

Personal Librarian (formerly called SIRE) offers unique and powerful 
search capabilities, but limited editing and printing features. A new 
version should be available by mid... 

OCLC Search CD450: Education Materials in Libraries (EMIL). (Online
Computer Library Center)

Sabelnaus, Linda J.
RQ, v27, n3, p416(3)
Spr, 1988

CODEN: RQRQAQ DOCUMENT TYPE: evaluation ISSN: 0033-7072

LANGUAGE: ENGLISH RECORD TYPE: FULLTEXT 

WORD COUNT: 1863 LINE COUNT: 00140 

number of hits accumulated during the search. Past searches can be 
modified by entering the search set number and an exclamation point and 
then the new search terms. 
There are three...

...middle of viewing retrieved sets.

One of the EMIL CD's best features is subject searching. The index
is arranged alphabetically and includes for example, Library of Congress
subject headings, title words, author names, and company names. The
index is a pop-up screen, and the terms can be chosen by pointing and
"shooting." One word of warning, the index lists the number of
occurrences of the term in the database. This is not equivalent to the
number of records because the same term could be present several times in
the same record. The bound subject terms can be free-text searched, but
this is a problem because the user probably will not know the Library of 
Congress subject headings and may generate a huge set. To limit some 
searches can be very difficult, because the only other subject access is 
through the title. 
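
The reviewer's warning about occurrences versus records can be illustrated with a minimal sketch. The sample records and the `count_term` helper below are hypothetical and are not EMIL's data or software; they only show that a term repeated inside one record raises the occurrence count without raising the record count.

```python
import re

def count_term(records, term):
    """Return (total occurrences, number of records containing the term)."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    occurrences = 0
    matching_records = 0
    for text in records:
        hits = len(pattern.findall(text))
        occurrences += hits
        if hits:
            matching_records += 1
    return occurrences, matching_records

# Hypothetical records, for illustration only.
records = [
    "Reading instruction: reading games for early reading",
    "Mathematics workbook",
    "Reading comprehension tests",
]
print(count_term(records, "reading"))  # (4, 2): four occurrences, two records
```

An index that displays the first number will overstate the size of the set a searcher actually retrieves, which is the mismatch the review describes.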

OCLC...

...in the OCLC catalog. However, OCLC does have a way to go to perfect its
search software. Error messages when logging into the database will put
the user into a never...

Reviews: CD-ROM - Health Reference Center

Ashworth, Wilfred

New Library World v96n1124 PP: 35-36 1995
ISSN: 0307-4803 JRNL CODE: NLW
WORD COUNT: 558 

...TEXT: medicine and training, and law and medicine, as well as the 
central medical issues. 

The search process operates on two levels. Simple word input leads to a 
scrolling list of headings and their subdivisions, and definitions of 
terms. The scrolling list also includes journal titles and authors' 
names. More extended searches may be called for and these search for 
wanted terms in the whole text of the database, combining two or more as 
a Boolean and. Against each requested term the number of occurrences
is shown as a guide, and where a combination still produces too many hits
only 1,000 are listed. These seem to be the most recent additions. To
discover more a date could be used as an added restricting search term.

Information found, or selected parts of it, can readily be printed out to
paper...

The Serials Directory/EBSCO CD-ROM

Bell, Suzanne S

Information Today v10n9 PP: 24-25+ Oct 1993
ISSN: 8755-6286 JRNL CODE: IFT
WORD COUNT: 1956 

...TEXT: executed in the CD version. 

Wild card and righthand truncation may be used in most search fields,
including Search Limiters. Searching for phrases containing punctuation or
stopwords is remarkably easy, unlike...

...bank for america or bank of the america). The program returns any
variation of hyphenated terms (for example, post-adolescent, post
adolescent, postadolescent). This forgiving feature greatly reduces the
number of times a query returns a puzzling and frustrating "No Hits...

AUTHORITY FILES SEARCHING

The Authority File search method allows the user direct access to the
lists of valid entries for the Subject, Title, Publisher, and Index and
Abstract indexes. This second search option is useful, especially when 
the searcher is looking for specific or known information. Since one is 
directly accessing the index files (indexes here in a database sense), the 
system executes searches much more quickly in the Authority Files than in 
the Query Profile. 

At user levels 3 and above, searches from either the Query Profile or 
the Authority Files may be saved, then recalled and re-executed at a... 

...RESULTS

A brief version of the result list is automatically displayed upon
completion of a search. The user may browse the list in several ways:
using the arrow, page up/down... 

Text retrieval - Windows file indexers

Marshall, Patrick; Watt, Peggy

InfoWorld v15n21 PP: 123-140 May 24, 1993

ISSN: 0199-6649 JRNL CODE: IFW 

WORD COUNT: 13999 

...TEXT: term must come first. The program can do fancy tricks such as
range and quorum searches.

Finally, one of ZyIndex's most powerful capabilities is its field
searching, you can define...

...difficulty on our progressive search task.

With its clearly laid-out buttons to access previous searches, fields,
and concepts, ZyIndex's Search Request screen is an exemplar of
efficiency. You will also find buttons that pop up a thesaurus for 
search terms and online help to construct search arguments. Vocabulary 
shows a list of the entire contents of the index along with the number 
of times each word appears. 

The program offers flexibility for getting the actual data once you have 
retrieved a set of files. You can set Search Results display in either 
of two modes: a simple listing of retrieved files that shows the number 
of hits in each (along with the file's path... 

...in context) view that displays each hit in the context of its 
surrounding text. 

The Search Results screen also offers a row of buttons that will return 
you to the search screen, show the highlighted file, print file, or 
launch them into their parent application. Once... 

00712949 93-62170
Multimedia in brief 

Nordgren, Layne 

CD-ROM Professional v6n3 PP: 133-136 May 1993 
ISSN: 1049-0833 JRNL CODE: LDP 
WORD COUNT: 2528 

...TEXT: question mark (?) for a single character and asterisk (*) for a 
string of characters. Reset and Search buttons are available from the 
bottom of the template. 

When you start a search, the...

...display and you can either cancel the search or choose OK to bring up
the title list.

The title list window displays the total number of articles and
occurrences and then each article preceded by the number of
occurrences within the article. You can sort the title list
alphabetically or by occurrences. When you open an article the search
terms appear in bold text; "remote control" arrows appear at the top left 
of the screen for jumping to the next, previous, beginning, or end 
occurrence. 

The Search menu includes a choice for setting search options. A 
limiting feature allows you to select one or more alternatives among 
titles, text, bibliographies, fact boxes, and picture captions. A proximity 
box sets phrase searching to the same article, the same paragraph, or 
within a specified number of words. An... 

...to specify whether the word must be found in the same order specified by
the search.

The Timeline and Knowledge Tree provide additional search strategies. The 
Timeline includes over 5,000... 

Stepping out: Let the Web be your guide (Online Connections-Exploring The
Information Highway)
John Eckhouse

HOME PC, 1997, n 409, PG161

PUBLICATION DATE: 970901

JOURNAL CODE: HPC LANGUAGE: English

RECORD TYPE: Full text

SECTION HEADING: Electronic Communities 

WORD COUNT: 1884 

had in mind was a French bistro in the South of Market area, so I 
searched those categories using the pull-down selection boxes and found
Palomino, with its "opulent decor...

...Bay and well-executed cuisine." Then I used the What Else is Nearby
tool to find Harry Denton's, a watering hole with live music and
dancing.

Some of the listings are disappointing, though. When I did a search
of movies in the vicinity of Fisherman's Wharf, a major tourist
attraction, CitySearch found only one theater-even though I know there
are several in the area. Some of the movie listings fail to include
the theater's address, phone number or show times, and you won't
find links to movie reviews. Many other listings, including those
for museums, lack graphics or photos and consist of just two lines of
text with a name, address and phone number. To learn more, you have to
jump to another page. And while CitySearch has a powerful search 
engine, it covered events only within San Francisco's city limits. 

CitySearch's handiest feature... 

UNITED AIRLINES EBT: United Airlines Delivers Aircraft Maintenance
Information With Electronic Book Technologies' Dynatext

August 1, 1994

Byline: Business Editors

...built automatically from the
structures in the source SGML documents--that enables them to easily
locate and navigate to a desired piece of information (text,
graphics, tables, etc.). Engineers can also search for information
by typing in words or phrases. The TOC instantly indicates the
locations and number of search occurrences. In addition, engineers
can annotate reference material for public or private viewing and
create their own hypertext links to associated material for efficient
cross-referencing.

DynaText, introduced in August of 1990, is the... 

0395976 BW783

ELECTRONIC BOOK TECH: Sybase To Deliver Software Documentation On CD-ROM
With Electronic Book Technologies' Dynatext

April 5, 1994

Byline: Business Editors

...built automatically from the structures in the source
SGML documents--that enables them to easily locate and navigate to a
desired piece of information (text, graphics, tables, etc.).
Customers can also search for information by typing in key words or
phrases. The TOC instantly indicates the locations and number of
search occurrences. In addition, customers can annotate reference
material for public or private viewing and create their own hypertext
links to associated material for efficient cross-referencing.
DynaText, introduced in August of 1990, is the...

...SGML document and automatically builds a dynamic electronic
book that enables users to quickly browse, search, and annotate
large, highly structured documents. The electronic books can be 
shared among heterogeneous client... 

ELECT BOOK TECH: Information Handling Services Selects Electronic Book
Technologies' DynaText To Deliver Electronic Information

February 22, 1994

Byline: Business Editors

...built automatically from the structures in the source
SGML documents--that enables them to easily locate and navigate to
a desired piece of information (text, graphics, tables, etc.).
Engineers can also search for information by typing in key words or
phrases. The dynamic table of contents instantly indicates the
locations and number of search occurrences. In addition, engineers
can annotate reference material for private or public viewing and
create their own hypertext links to associated material for efficient
cross-referencing.

DynaText, introduced in August of 1990, represented the...

...SGML document and automatically builds a
dynamic electronic book that enables users to quickly browse, search,
and annotate large, highly structured documents. The electronic
books can be shared among heterogeneous networks...

Dow Jones Interactive Next-Generation Server Software Eases Integration of
Global News and Business Information into Corporate Intranets

Business Wire

Tuesday, June 8, 1999 19:20 EDT

JOURNAL CODE: BW LANGUAGE: ENGLISH RECORD TYPE: FULLTEXT
DOCUMENT TYPE: NEWSWIRE
WORD COUNT: 1,053

...users and repeatedly send
them directly to their intranet. Administrator interface features also
include usage reports, allowing managers to track the number of visits
to specific areas of the intranet and the number of times articles
have been viewed.

Editorial Interface Offers Control and Eases Distribution

Articles from Dow Jones are drawn from thousands of publications in Dow Jones
Interactive to match topic profiles customers have requested. Content
managers can view an index of folders and articles...

...e-mail articles,
add commentary, mark articles "hot," and delete articles as necessary.
To ease searching on their intranets, they can also apply Dow Jones and
internal coding to articles.

Leveraging...

PlanetResume.com and WCVB-TV Channel 5 Form Partnership

PR Newswire

Wednesday, August 25, 1999 14:38 EDT

...largest audience available.

"Classified" sponsors will receive, in addition to on-air exposure, a job
listing and hyperlink on the promoted site. They will also have the ability
to update job listings an unlimited number of times. Ultimately, this
opportunity will expose their employment openings to over 15 million web hits
per month.

About PlanetResume.com

PlanetResume.com is a successful Internet recruitment site that focuses...

...personalized approach to recruiting which includes one-to-one training and
service representatives who perform searches for clients. Client companies
are able to advertise free, unlimited job postings. PlanetResume.com has

Set     Items   Description
S1     403207   TRADEMARK? ? OR TRADE()MARK? ? OR TRADENAME? ? OR NAME? ?
                OR LOGO OR LOGOS
S2     842344   KEYWORD? ? OR WORD? ? OR TERM? ?
S3       3559   (S1:S2 OR MATCH??? OR HIT OR HITS OR RESULT???)(5N)(HIGHLIGHT?
                OR HILIGHT? OR HILIT??? OR (HI OR HIGH)()(LIT??? OR LIGHT???))
S4      55910   (NUMBER OR COUNT???)(3W)(OCCURRENCES OR TIMES)
S5       1870   COUNT???(3N)OCCURRENCE? ?
S6      15047   PART(3W)(WEBPAGE? ? OR PAGE? ? OR DOCUMENT? ? OR ARTICLE? ?
                OR WEBSITE? ? OR SITE? ? OR RECORD? ? OR FILE? ?)
S7        192   S4:S5(10N)(S1:S2 OR MATCH??? OR HIT OR HITS OR PHRASE? ? OR
                STRING? ?)(10N)(METATAG? ? OR META()TAG? ? OR HIDDEN OR
                TITLE? ? OR HYPERLINK? ? OR LINK? ? OR PARAGRAPH? ? OR S6)
S8        695   S4:S5(10N)(S1:S2 OR MATCH??? OR HIT OR HITS OR PHRASE? ? OR
                STRING? ?)(10N)(SECTION? ? OR PORTION? ? OR AREA? ? OR PIECE? ?
                OR SEGMENT? ? OR CATEGOR??? OR FIELD? ? OR S6)
S9    2328650   COLUMN?? OR GRID? ? OR ARRAY? ? OR TABLE? ? OR LIST???? OR
                REPORT???
S10   2089632   SEARCH??? OR QUERY??? OR QUERIE? ? OR RETRIEV??? OR FIND???
                OR DISCOVER??? OR LOCATE? ? OR LOCATING
S11       170   S9(10N)S7:S8
S12        79   S10(100N)S11
S13        23   S12 AND AC=US/PR AND AY=(1978:1999)/PR
S14        23   S12 AND AC=US AND AY=1978:1999
S15        23   S12 AND AC=US AND AY=(1978:1999)/PR
S16        22   S12 AND PY=1978:1999
S17        32   S13:S16
S18        32   IDPAT (sorted in duplicate/non-duplicate order)
S19     15010   COUNT???(5N)(S1:S2 OR MATCH??? OR HIT OR HITS OR PHRASE? ?
                OR STRING? ?)
S20      1816   S19(10N)(METATAG? ? OR META()TAG? ? OR HIDDEN OR TITLE? ?
                OR HYPERLINK? ? OR LINK? ? OR PARAGRAPH? ? OR SECTION? ? OR
                PORTION? ? OR AREA? ? OR PIECE? ? OR SEGMENT? ? OR CATEGOR???
                OR FIELD? ? OR S6)
S21       306   S9(10N)S20
S22        73   S10(50N)S21
S23        59   S22 NOT S12
S24        20   S23 AND AC=US/PR AND AY=(1978:1999)/PR
S25        20   S23 AND AC=US AND AY=1978:1999
S26        20   S23 AND AC=US AND AY=(1978:1999)/PR
S27        14   S23 AND PY=1978:1999
S28        24   S24:S27
S29        24   IDPAT (sorted in duplicate/non-duplicate order)

REAL TIME STRUCTURED SUMMARY SEARCH ENGINE

ECHTZEITSUCHMOTOR MIT STRUKTURIERTEN ZUSAMMENFASSUNGEN

MOTEUR DE RECHERCHE SOMMAIRE STRUCTURE FONCTIONNANT EN TEMPS REEL

PATENT ASSIGNEE:

March Networks Corporation, (2652171), Tower 2, 5th floor, 555 Legget
Drive, Kanata, Ontario K2K 2X3, (CA), (Proprietor designated states:
all)

INVENTOR:

REED, Jim, 26 Sheperds Glen Avenue, Kanata, Ontario K2M 2M9, (CA)
STREATCH, Paul, P.O. Box 1196, Richmond, Ontario K0A 2Z0, (CA)

LEGAL REPRESENTATIVE:

McLean, Robert Andreas et al (88231), 25 The Square, Martlesham Heath,
Ipswich IP5 3SL, (GB)
PATENT (CC, No, Kind, Date): EP 922260 A1 990616 (Basic)
                             EP 922260 B1 030129
                             WO 98009229 980305
APPLICATION (CC, No, Date): EP 97937389 970829; WO 97CA611 970829
PRIORITY (CC, No, Date): CA 2184518 960830
DESIGNATED STATES: DE; FR; GB

INTERNATIONAL PATENT CLASS (V7): G06F-017/30
NOTE:

No A-document published by EPO
LANGUAGE (Publication, Procedural, Application): English; English; English
FULLTEXT AVAILABILITY:

Available Text   Language    Update    Word Count
CLAIMS B         (English)   200305        669
CLAIMS B         (German)    200305        665
CLAIMS B         (French)    200305        777
SPEC B           (English)   200305       2597
Total word count - document A                0
Total word count - document B             4708
Total word count - documents A + B        4708

...SPECIFICATION the next unique field name in the summary structure
database starting from the first, and at 13 retrieves from the summary
candidate database the next summary candidate (selected candidate) also
starting from the first having a field name matching the summary record
field name that has just been set. For example, the first summary
record field name might be "Category". The first summary candidate
with a field name category might be "Financial" having the criteria
keywords noted above.

Next, the number of occurrences of each word on the criteria
word list in the current document for the selected candidate
("Financial") is determined at 14 and these occurrences are...
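
The counting step in this excerpt can be sketched as follows. This is only an illustration under stated assumptions: the excerpt does not reproduce the criteria keywords for "Financial" or say what is done with the counts afterwards, so the candidate record, its criteria words, the sample document, and the `criteria_word_counts` helper are all hypothetical, and the sketch stops at counting.

```python
import re
from collections import Counter

def criteria_word_counts(document_text, criteria_words):
    """Count how often each criteria word occurs in the document text."""
    tokens = Counter(re.findall(r"[a-z0-9']+", document_text.lower()))
    return {word: tokens[word.lower()] for word in criteria_words}

# Hypothetical summary candidate; the criteria words are not from the patent.
candidate = {
    "field": "Category",
    "value": "Financial",
    "criteria": ["bank", "stock", "loan"],
}
document = "The bank raised its loan rates; bank stock fell."

counts = criteria_word_counts(document, candidate["criteria"])
print(counts)  # {'bank': 2, 'stock': 1, 'loan': 1}
```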

Machining method using numerical control apparatus

Bearbeitungsverfahren mit Verwendung von einem numerischen Steuerungsgerat
Methode d'usinage utilisant un appareil a commande numerique

PATENT ASSIGNEE:

MITSUBISHI DENKI KABUSHIKI KAISHA, (208580), 2-3, Marunouchi 2-chome
Chiyoda-ku, Tokyo 100, (JP), (applicant designated states:
CH;DE;FR;GB;LI)
INVENTOR:

Hirai, Hayao, c/o Mitsubishi Denki K.K., Nagoya Seisakusho, 1-14,
Yadaminami 5-chome, Higashi-ku, Nagoya-shi, Aichi 461, (JP)
Fujimoto, Akihiko, Mitsubishi E.M.S. Co., Ltd., 1071,
Higashi-Ozone-cho-Kami 5-chome, Kita-ku, Nagoya-shi, Aichi 462-91, (JP)

LEGAL REPRESENTATIVE:

Ritter und Edler von Fischern, Bernhard, Dipl.-Ing. et al (9672),
Hoffmann Eitle, Patent- und Rechtsanwalte, Arabellastrasse 4, 81925
Munchen, (DE)

PATENT (CC, No, Kind, Date): EP 753805 A1 970115 (Basic)
                             EP 753805 B1 990506
APPLICATION (CC, No, Date): EP 96111105 960710;
PRIORITY (CC, No, Date): JP 95197308 950710
DESIGNATED STATES: CH; DE; FR; GB; LI
INTERNATIONAL PATENT CLASS (V7): G05B-019/418;
ABSTRACT WORD COUNT: 173

LANGUAGE (Publication, Procedural, Application): English; English; English

FULLTEXT AVAILABILITY:

Available Text   Language    Update    Word Count
CLAIMS B         (English)   9918         2061
CLAIMS B         (German)    9918         1991
CLAIMS B         (French)    9918         2306
SPEC B           (English)   9918       189869
Total word count - document A                0
Total word count - document B           196227
Total word count - documents A + B      196227

...SPECIFICATION the X and Z directions in combination, which qualitatively 
requires much shorter machining time, is best in terms of the number 
of times the material is cut and the machining length over which the 
material is fed; 

deciding which of... 

Method for performing a search of a plurality of documents for similarity
to a query

Verfahren zur Durchfuhrung der Suche nach Ahnlichkeiten mit einer Abfrage
in einer Dokumentenmenge
Methode pour effectuer une recherche de similarite avec une requete dans un
ensemble de documents

PATENT ASSIGNEE:

XEROX CORPORATION, (219783), Xerox Square, Rochester, New York 14644,
(US), (Proprietor designated states: all)
INVENTOR:

Henderson, Richard D., 505 Aleta Avenue, San Jose, California 95128, (US)
Barbarino, Michael J., 363 California Street, Moss Beach, California
94038, (US)
LEGAL REPRESENTATIVE:

Skone James, Robert Edmund et al (50281), GILL JENNINGS & EVERY Broadgate
House 7 Eldon Street, London EC2M 7LH, (GB)
PATENT (CC, No, Kind, Date): EP 590858 A1 940406 (Basic)
                             EP 590858 B1 010905
APPLICATION (CC, No, Date): EP 93307488 930922;
PRIORITY (CC, No, Date): US 953166 920929
DESIGNATED STATES: DE; FR; GB
INTERNATIONAL PATENT CLASS (V7): G06F-017/30
ABSTRACT WORD COUNT: 175
NOTE:

Figure number on first page: 2
LANGUAGE (Publication, Procedural, Application): English; English; English

FULLTEXT AVAILABILITY:

Available Text   Language    Update    Word Count
CLAIMS A         (English)   EPABF2        515
CLAIMS B         (English)   200136        502
CLAIMS B         (German)    200136        484
CLAIMS B         (French)    200136        538
SPEC A           (English)   EPABF2       2126
SPEC B           (English)   200136       2206
Total word count - document A             2641
Total word count - document B             3730
Total word count - documents A + B        6371



...SPECIFICATION been determined in each of the plurality of documents.

The query word can include a plurality of query terms, all of which
are searched in each document, in turn, rather than being searched term
by...

...is produced according to the document ranking.

In one embodiment, a list of words contained within the retrieved
document is generated, and the query words are compared to the
generated list of words.

In another embodiment, all of the query words are compared against a
first portion of the documents. Subsequently, all of the query words
are compared against a second portion of the documents. The documents
are then ranked, according to the number of occurrences of the
query words determined in each document, and a list of the documents
is generated according to the document ranking.

In another embodiment, the documents are organized into an inverted
index. In this case, instead of retrieving a document, the segment of a
list of document-id and term-frequency pairs related to the query term 
and the document is examined. 
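
The inverted-index embodiment can be illustrated with a minimal sketch. It is not the claimed method: it simply stores, for each term, a list of (document-id, term-frequency) pairs and ranks documents by the summed occurrences of the query words. The function names, the tokenization, and the sample documents are assumptions made for illustration.

```python
import re
from collections import Counter, defaultdict

def build_inverted_index(documents):
    """Map each term to a list of (document-id, term-frequency) pairs."""
    index = defaultdict(list)
    for doc_id, text in documents.items():
        term_counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
        for term, freq in term_counts.items():
            index[term].append((doc_id, freq))
    return index

def rank_documents(index, query_words):
    """Rank documents by total occurrences of the query words."""
    scores = Counter()
    for word in query_words:
        # Only the index segment for this term is examined, not the documents.
        for doc_id, freq in index.get(word.lower(), []):
            scores[doc_id] += freq
    # Highest score first; ties broken by document id for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical corpus, for illustration only.
documents = {
    "d1": "search engines rank documents",
    "d2": "documents about documents and search",
    "d3": "unrelated text",
}
index = build_inverted_index(documents)
ranking = rank_documents(index, ["documents", "search"])
print(ranking)  # [('d2', 3), ('d1', 2)]
```

Documents that contain none of the query words never enter the score table, so they are absent from the ranked list rather than listed with a score of zero.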

The present invention further provides a programmable document 
searching system when suitably programmed for carrying out the method of 
any of claims 1 to 10. 

The...

...accompanying drawing, in which:

Figure 1 is a block diagram outlining the steps for performing a
similarity search of a corpus of documents, in accordance with the
prior art; and

Figure 2 is a block...

...SPECIFICATION been determined in each of the plurality of documents.
The query word can include a plurality of query terms, all of which
are searched in each document, in turn, rather than ...is produced
according to the document ranking.

In one embodiment, a list of words contained within the retrieved
document is generated, and the query words are compared to the
generated list of words.

In another embodiment, all of the query words are compared against a
first portion of the documents. Subsequently, all of the query words
are compared against a second portion of the documents. The documents
are then ranked, according to the number of occurrences of the query
words determined in each document, and a list of the documents is
generated according to the document ranking.

In another embodiment, the documents are organized into an inverted
index. In this case, instead of retrieving a document, the segment of a
list of document-id and term-frequency pairs related to the query term
and the document is examined.

The invention is illustrated in the accompanying drawing, in which:
Figure 1 is a block diagram outlining the steps for performing a
similarity search of a corpus of documents, in accordance with the
prior art; and

Figure 2 is a block diagram outlining the steps for performing a
similarity search of a corpus of documents in accordance with a
preferred embodiment of the invention.
The present invention...

...CLAIMS of occurrences that a word appears in the identified document;
and wherein the method of comparing the query words against the
generated list of words comprises for each document identifier, in
turn, comparing each of said plurality of query words to each word
coupled with each document identifier.
8. A method according to claim 6, when dependent on claim 2, wherein the
method of generating a list of words comprises generating an index
of entries for all words of a portion of all said documents, each
of said documents being identified by a document identifier, each
entry containing a document identifier and a number of occurrences
that a word appears in the identified document; and wherein the
method of comparing the query words against the generated list
of words comprises for each document identifier, in turn, comparing
each of said plurality of query words to each word coupled with
each document identifier.
Method and apparatus for document image processing
Verfahren und Gerat zur Dokumentbildverarbeitung
Procede et appareil de traitement d'images de documents

PATENT ASSIGNEE: 

XEROX CORPORATION, (219783), Xerox Square, Rochester, New York 14644, 
(US), (applicant designated states: DE ; FR; GB) 
INVENTOR: 

Withgott, M. Margaret, 11 Carriage Court, Los Altos, California 94022,
(US)
Rao, Ramana B., 50 Ina Court, San Francisco, California 94112, (US)
LEGAL REPRESENTATIVE:

Mackett, Margaret Dawn et al (60332), Rank Xerox Ltd Patent Department
Parkway, Marlow Buckinghamshire SL7 1YL, (GB)
PATENT (CC, No, Kind, Date): EP 544433 A2 930602 (Basic)
EP 544433 A3 931222
EP 544433 B1 980527
APPLICATION (CC, No, Date): EP 92310434 921116; 
PRIORITY (CC, No, Date): US 794555 911119 
DESIGNATED STATES: DE; FR; GB 

INTERNATIONAL PATENT CLASS (V7) : G06K-009/00; G06K-009/72 ; 
ABSTRACT WORD COUNT: 116 

LANGUAGE (Publication, Procedural, Application): English; English; English
FULLTEXT AVAILABILITY:

Available Text   Language    Update   Word Count
CLAIMS B         (English)   9822     561
CLAIMS B         (German)    9822     449
CLAIMS B         (French)    9822     635
SPEC B           (English)   9822     3866
Total word count - document A 0
Total word count - document B 5511
Total word count - documents A + B 5511

...SPECIFICATION with which some or all of the words occur. For example,
Salton & McGill, Introduction to Modern Information Retrieval, Chapter
2, pp. 30, 36, McGraw-Hill, Inc., 1983, indicates that in information
retrieval contexts, the frequency of use of a given term may correlate
with the importance of that term...

...be useful for automatic document summarization and/or annotation. Word
frequency information can also be used in locating, indexing, filing,
sorting, or retrieving documents.

Another use for knowledge of word frequency is in text editing. For 
example, one text processing device has been proposed for preventing the 
frequent use of the same words in a text by categorizing and displaying 
frequently occurring words of the document. A list of selected words
and the number of occurrences of each word is formulated for a
given text location in a portion of the text, and the designated word
and its location are displayed on a CRT.

An extension of this thesis is that knowledge of... 

...also is useful, for example, for automatic document summarization.
Phrase frequency information can also be used in locating, indexing,
filing, sorting, or retrieving documents. 

Heretofore, though, word frequency determinations have been performed 
on electronic texts in which the contents have... 
Method and apparatus for determining the frequency of words in a document
without document image decoding
Verfahren und Gerat zur Bestimmung der Wortfrequenz in einem Dokument ohne
Dokumentbilddekodierung
Procede et appareil de determination de la frequence de mots dans un
document sans decodage de l'image du document

PATENT ASSIGNEE: 

XEROX CORPORATION, (219783), Xerox Square, Rochester, New York 14644, 
(US), (applicant designated states: DE ; FR;GB) 
INVENTOR: 

Cass, Todd A., 107 Hammond Street, Cambridge, Massachusetts 02138, (US)
Huttenlocher, Daniel P., 314 Comstock Road, Ithaca, New York 14580, (US)
Halvorsen, Per-Kristian, 11 Carriage Court, Los Altos, California 94022,
(US)
Withgott, M. Margaret, 11 Carriage Street, Los Altos, California 94022,
(US)
Kaplan, Ronald M., 4015 Orme Street, Palo Alto, California 94306, (US)
Rao, Ramana B., 50 Ina Court, San Francisco, California 94112, (US)
LEGAL REPRESENTATIVE:

Skone James, Robert Edmund et al (50281), GILL JENNINGS & EVERY Broadgate
House 7 Eldon Street, London EC2M 7LH, (GB)
PATENT (CC, No, Kind, Date): EP 544430 A2 930602 (Basic)
EP 544430 A3 931222
EP 544430 B1 990623
APPLICATION (CC, No, Date): EP 92310431 921116; 
PRIORITY (CC, No, Date): US 795173 911119 
DESIGNATED STATES: DE; FR; GB 

INTERNATIONAL PATENT CLASS (V7) : G06K-009/00; 
ABSTRACT WORD COUNT: 59 

LANGUAGE (Publication, Procedural, Application): English; English; English
FULLTEXT AVAILABILITY:

Available Text   Language    Update   Word Count
CLAIMS B         (English)   9925     453
CLAIMS B         (German)    9925     401
CLAIMS B         (French)    9925     539
SPEC B           (English)   9925     3964
Total word count - document A 0
Total word count - document B 5357
Total word count - documents A + B 5357



...SPECIFICATION with which some or all of the words occur. For example,
Salton & McGill, Introduction to Modern Information Retrieval, Chapter
2, pp. 30, 36, McGraw-Hill, Inc., 1983, indicates that in information
retrieval contexts, the frequency of use of a given term may correlate
with the importance of that term...

...be useful for automatic document summarization and/or annotation. Word
frequency information can also be used in locating, indexing, filing,
sorting, or retrieving documents.

Another use for knowledge of word frequency is in text editing. For 
example, one text processing device has been proposed for preventing the 
frequent use of the same words in a text by categorizing and displaying 
frequently occurring words of the document. A list of selected words
and the number of occurrences of each word is formulated for a
given text location in a portion of the text, and the designated word
and its location are displayed on a CRT.

Heretofore, though, such word frequency determinations have been
performed...
Heterogeneous software configuration management apparatus.
Heterogene Softwarekonfigurationsverwaltungsvorrichtung.
Dispositif heterogene de gestion de configurations de logiciels.

PATENT ASSIGNEE: 

Hewlett-Packard Company, (206030), 3000 Hanover Street, Palo Alto, 
California 94304, (US), (applicant designated states: DE ; FR ; GB) 
INVENTOR:

Robinson, Douglas B., 7 Crestwood Drive, Hollis, NH 03049, (US)
Lubkin, David C., 11 Westray Drive, Nashua, NH 03062, (US)
Leach, Paul J., 23 Swan Road, Winchester, MA 01890, (US)
McCue, Daniel, Computing Lab. Univ. of Newcastle upon Tyne, Newcastle,
NE1 7RU, (GB)
Chase, Robert P., Jr., Millenium Teamware, 24 Prime Park Way, Natick, MA
01760, (US)
LEGAL REPRESENTATIVE:

Powell, Stephen David et al (52311), WILLIAMS, POWELL & ASSOCIATES 34
Tavistock Street, London WC2E 7PB, (GB)
PATENT (CC, No, Kind, Date): EP 501613 A2 920902 (Basic) 

EP 501613 A3 930901 
APPLICATION (CC, No, Date): EP 92300824 920130; 
PRIORITY (CC, No, Date): US 662561 910228 
DESIGNATED STATES: DE; FR; GB 

INTERNATIONAL PATENT CLASS (V7) : G06F-009/44; G06F-009/46; 
ABSTRACT WORD COUNT: 173 

LANGUAGE (Publication, Procedural, Application): English; English; English
FULLTEXT AVAILABILITY:

Available Text   Language    Update   Word Count
CLAIMS A         (English)   EPABF1   633
SPEC A           (English)   EPABF1   10561
Total word count - document A 11194
Total word count - document B 0
Total word count - documents A + B 11194

...SPECIFICATION 15b have different capabilities, the names of certain 
builders 13 and certain helper nodes 15b may be listed more than once 
in the respective builder and helper fields of the builder list file 
23 and default file. The number of times the name of a foreign
builder 13 or helper node 15b is listed in the respective builder and
helper field for a given host type indicates the relative power among
the other foreign builders 13 or helper nodes 15b respectively.



Once the hcm tool 17 has located a builder list file 23, the tool 17 
checks a flag or other indicator of the file... 
System for checking the translation of a document.
System zur Prufung der Ubersetzung eines Dokuments.
Systeme de verification de la traduction d'un document.

PATENT ASSIGNEE: 

THE BRITISH AND FOREIGN BIBLE SOCIETY, (1458170), Stonehill Green,
Westlea, Swindon SN5 7DG, (GB), (applicant designated states:
AT; BE; CH; DE; DK; ES; FR; GB; GR; IT; LI; LU; MC; NL; PT; SE)
INVENTOR:

Robinson, David William Clough, 6 Hillside Mansions, Barnet Hill,
Chipping Barnet, Hertfordshire EN5 5RH, (GB)

LEGAL REPRESENTATIVE:

Newstead, Michael John et al (34354), Page Hargrave Temple Gate House
Temple Gate, Bristol BS1 6PL, (GB)
PATENT (CC, No, Kind, Date): EP 499366 A2 920819 (Basic) 

EP 499366 A3 931020 
APPLICATION (CC, No, Date): EP 92300597 920123; 
PRIORITY (CC, No, Date): GB 9103080 910214 

DESIGNATED STATES: AT; BE; CH ; DE; DK; ES ; FR; GB ; GR; IT; LI; LU; MC; NL; 
PT; SE 

INTERNATIONAL PATENT CLASS (V7) : G06F-017/27; G06F-017/28; 
ABSTRACT WORD COUNT: 72 

LANGUAGE (Publication, Procedural, Application): English; English; English
FULLTEXT AVAILABILITY:

Available Text   Language    Update   Word Count
CLAIMS A         (English)   EPABF1   417
SPEC A           (English)   EPABF1   8305
Total word count - document A 8722
Total word count - document B 0
Total word count - documents A + B 8722

...SPECIFICATION 68 is a list of records 69a-n, each record 69a-n
corresponding to one of the words 64a-n and being a record of the 
number of times that that word 64a-n has been found to occur in the 
source document 60. 

The source word list 63, the source word occurrence set 65 and 
the source word frequency list 68 may be compiled by taking each word 
62m-z in each segment 61a-n in turn, searching for it in the source 
word list 63; appending it to the source word list 63 if a list of 
records 89a-n, each record 89a-n corresponding to one of the words 
84a-n and being a record of the number of times that that word 
84a-n has been found to occur. 

The target word list 83, the target word occurrence set 85 and 
the target word frequency list 88 may be compiled by taking each word
82m-z in each segment 81a-n in turn, searching for it in the target 
word list 83; appending it to the target word list 83 if... 

...in the target document, are carried out recursively, as is indicated in 
Figure 4. 
ELEMENT 2 

To locate a pair of words, one from the source document 60 and one 
from the target document 80... 

...taken in turn and is paired, in turn, with each word 84a-n which occurs 
in a segment which is included in the target word occurrence list 
67aa-nn corresponding to that word 64a-n. For each pairing, for instance
of a word from the source document designated x and a word from the
target document designated y, the following values are determined:
m(x,y) - the number of times that the word x and the
word y occur in corresponding segments 61a-n and 81a-n in the source
and target documents 60 and 80 (this may be...
Data compression method and apparatus
Datenkompressionsmethode und Gerat
Procede et appareil de compression de donnees

PATENT ASSIGNEE:

FUJITSU LIMITED, (211460), 1015, Kamikodanaka, Nakahara-ku, Kawasaki-shi,
Kanagawa 211, (JP), (applicant designated states: DE;FR;GB)

INVENTOR: 

Yoshida, Shigeru, 876-1-406, Kokubu, Ebina-shi, Kanagawa, 243-04, (JP) 
Okada, Yoshiyuki, 256-1-405, Tanaka, Isehara-shi, Kanagawa, 259-11, (JP)
Nakano, Yasuhiko, 976-1-202, Tsumada, Atsugi-shi, Kanagawa 243, (JP) 
Chiba, Hirotaka, 2-3-10, Sakaecho, Atsugi-shi, Kanagawa 243, (JP) 

LEGAL REPRESENTATIVE: 

Billington, Lawrence Emlyn et al (28331), HASELTINE LAKE & CO Hazlitt
House 28 Southampton Buildings Chancery Lane, London WC2A 1AT, (GB)
PATENT (CC, No, Kind, Date): EP 471518 A1 920219 (Basic)
EP 471518 B1 961218
APPLICATION (CC, No, Date): EP 91307343 910809;
PRIORITY (CC, No, Date): JP 90213990 900813; JP 90281431 901019; JP
90281432 901019; JP 90281433 901019
DESIGNATED STATES: DE; FR; GB 

INTERNATIONAL PATENT CLASS (V7) : H03M-007/42 ; 
ABSTRACT WORD COUNT: 96 



LANGUAGE (Publication, Procedural, Application): English; English; English
FULLTEXT AVAILABILITY:

Available Text   Language    Update   Word Count
CLAIMS A         (English)   EPABF1   1640
CLAIMS B         (English)   EPAB96   1477
CLAIMS B         (German)    EPAB96   1306
CLAIMS B         (French)    EPAB96   1605
SPEC A           (English)   EPABF1   12907
SPEC B           (English)   EPAB96   8641
Total word count - document A 14547
Total word count - document B 13029
Total word count - documents A + B 27576




...SPECIFICATION table 10a and a partial string table 10b(min). As shown in
FIG.21A, the partial string table 10b(min) has a counter area and a
carry flag area in addition to the aforementioned reference number 
(index i) and the extension character Ext(i). The counter area stores 
the number of times that the corresponding partial string has been 
accessed. The carry flag area stores a carry flag indicating whether or 
not the value of the corresponding counter area has overflowed... 

..value becomes greater than a predetermined threshold counter value). 

Referring to FIG.21B, the dictionary 220 is retrieved at step S201, 
and the counter value in the counter area corresponding to the partial 
string which is searched for is incremented at step S204. At subsequent 
step S205, it is determined whether or not the... 



...SPECIFICATION table 10a and a partial string table 10b(minutes). As
shown in FIG.21A, the partial string table 10b(minutes) has a counter
area and a carry flag area in addition to the aforementioned
reference number (index i) and the extension character Ext(i). The
counter area stores the number of times that the corresponding
partial string has been accessed. The carry flag area stores a carry
flag indicating whether or not the value of the corresponding counter
area has overflowed...

...value becomes greater than a predetermined threshold counter value). 

Referring to FIG.21B, the dictionary 220 is retrieved at step S201, 
and the counter value in the counter area corresponding to the partial 
string which is searched for is incremented at step S204. At 
subsequent step S205, it is determined whether or not the... 
...SPECIFICATION system (103) receives the input passwords. Then, the
password check section (101) references the data information holding
section (86) to check whether the input passwords match the previously
stored (defined) passwords (88).

If even one of the m input passwords does not match the defined
passwords (88), processing items 3 and later are repeated. If the
processing items are repeated the prescribed number of times, the
data request is not accepted.

7. If all the m input passwords match the passwords (88), the
password check section (101) reports it to the data fetch section
(102).

When receiving a report from the data information fetch section (99) or
password check section (101), the data fetch section (102) retrieves
the data (87) corresponding to the reference data name. Then, the data
(87) is sent from the...

...and multiple passwords according to the importance of data can realize 
the following: (1) Even if someone finds the user ID and a password, 
referencing of important data can be prevented. (2) High-level
FULLTEXT AVAILABILITY:
Available Text: CLAIMS A, CLAIMS B, SPEC A, SPEC B
Language: (English), (German), (French)
Update: EPAB95, EPABF1
...SPECIFICATION search patterns with missing or extra characters. Another
important aspect of the invention is the ability to search a text
stream in regions called segments, which may be, for example, sentences,
pages of text, or...

...contiguous 200 characters of each other, this would be an example of a
search using a "sliding" segment or window.

The search processor of the invention is capable of performing an 
"enumerated match" function within a specified segment . An enumerated 
match is defined as a search condition which specifies that the 
search processor will report a match only if the number of 
occurrences of a pattern within a text segment is greater than, less 
than or equal to a specified number. 
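
As a rough software illustration of the "enumerated match" condition (the excerpt describes a search processor, so this is only an analogy), the check can be sketched as below. It assumes patterns are plain substrings counted without overlap, and the relation names `gt`, `lt` and `eq` are hypothetical labels for "greater than", "less than" and "equal to".

```python
import operator

# Hypothetical names for the three relations mentioned in the excerpt.
_RELATIONS = {"gt": operator.gt, "lt": operator.lt, "eq": operator.eq}


def enumerated_match(segment, pattern, relation, number):
    """Report a match only if the number of occurrences of `pattern`
    within `segment` stands in `relation` to `number`."""
    occurrences = segment.count(pattern)  # non-overlapping substring count
    return _RELATIONS[relation](occurrences, number)
```

For example, `enumerated_match("abracadabra", "a", "gt", 3)` reports a match because the pattern occurs five times in the segment.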

The search processor is also capable of performing an "enumerated 
subset" function, which means that a match is reported... 

...there are a designated number of occurrences of various patterns
selected from a set, or list, of search patterns. For example, a
search could be defined to locate at least two of a set of three
search terms 'a', 'b' and 'c' within a specified segment, or to locate
two terms from the set consisting of "at least three 'a'", "at least
four 'b'", and "at...



...SPECIFICATION contiguous characters or words. A sentence and a page are
examples of "fixed" segments or windows. A search could specify various
combinations of patterns that must be found within a segment, such as a
sentence.

...in appended claim 1.
The method according to the invention is recited in appended claim 10.



The search processor of the invention is capable of performing an 
"enumerated match" function within a specified segment . An enumerated 
match is defined as a search condition which specifies that the 
search processor will report a match only if the number of 
occurrences of a pattern within a text segment is greater than, less 
than or equal to a specified number. 

The search processor is also capable of performing an "enumerated 
subset" function, which means that a match is reported... 

...there are a designated number of occurrences of various patterns
selected from a set, or list, of search patterns. For example, a
search could be defined to locate at least two of a set of three
search terms 'a', 'b' and 'c' within a specified segment, or to locate
two terms from the set consisting of "at least three 'a'", "at least
four 'b'", and "at...
Detailed Description

... expand toplevel categories that are of interest to view.

For selective, keyword, wizard and browse by category searches, clients
may specify a sorting criteria for presenting search results. According
to one embodiment, four ordering schemes are provided. In alphabetical
sort order products are...

...mode the resulting products are listed in descending order by their
usage (usage is defined as the number of times a product's URL is
clicked).

According to one embodiment of the present invention, at the final phase
of category search, wizard, or keyword search, a client 105 is
presented with a list of products based on the search criteria. The
client 105 has the option of making a demand based on the search.
This demand item consists of a title of the demand, a description, and
some other attributes defining...vendors. Vendors 191 may browse through
these demands using the same category tree described in the selective
search.

Additionally, these demands may be presented as news for all of the
vendors to attract...
Detailed Description

... to allow sites to be more accurately rated.

Full-text search and indexing systems such as web search engines
typically have two distinct means of organizing the presentation of
documents. The first means is usually...

...title, date of change or a ranking value based on a calculation whose
input may come in part from the document content. For example, in
searching for the word "car" in a set of documents, the resulting list
of matching documents might be sorted by the number of times the
word occurred in each document.



SUMMARY OF THE INVENTION 

According to one aspect of the present invention, one... 

...web site. This conceptual information is then utilized in constructing
the central catalog so that more accurate search results may be
generated in response to search queries applied to the catalog. This
categorization information is transmitted by an agent program on the host
to...
Detailed Description

... output, as illustrated by TABLE 12, is best suited to determine what
is happening in a specialized area. The data of TABLE 12 is used to
make this determination since it only takes into account the number
of times that a specific keyword was encountered regardless of
journals' impact factor. The data of TABLE 13 may be used as an
intermediate output between the data shown in TABLE 11 and 12. The data
of TABLE 13 may be the preferred output depending on the type of search
query. For example, the query may combine all of the database's keywords
or the query may combine all of the journals that the database contains.

The data of TABLE 14 reveals the...



18/3,K/20 (Item 20 from file: 349)
DIALOG(R)File 349:PCT FULLTEXT
(c) 2006 WIPO/Univentio. All rts. reserv.

00753784 **Image available**

METHOD AND APPARATUS FOR CATEGORIZING AND RETRIEVING NETWORK PAGES AND 
SITES 

PROCEDE ET DISPOSITIF SERVANT A CLASSER ET A EXTRAIRE DES PAGES ET DES 
SITES DE RESEAUX 

Patent Applicant/Inventor: 

GRANT Lee H, 4849 El Cemonte #169, Davis, CA 95616, US, US (Residence), 

US (Nationality) 
CAPIZZI Susan A, 4849 El Cemonte #169, Davis, CA 95616, US, US 
(Residence), US (Nationality) 
Legal Representative:

MILLEMANN Audrey A (agent), Weintraub Genshlea & Sproul, 11th floor, 400
Capitol Mall, Sacramento, CA 95814, US,
Patent and Priority information (Country, Number, Date): 

Patent: WO 200067161 A2-A3 20001109 (WO 0067161) 

Application: WO 2000US12376 20000503 (PCT/WO US0012376) 

Priority Application: US 99132694 19990504; US 2000565695 20000503 
Designated States: 

(Protection type is "patent" unless otherwise stated - for applications 
prior to 2004) 

AE AG AL AM AT AU AZ BA BB BG BR BY CA CH CN CR CU CZ DE DK DZ EE ES FI 
GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP LR LS LT LU LV MA MD MG MK
MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN 
YU ZA ZW 

(EP) AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE 
(OA) BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG 
(AP) GH GM KE LS MW SD SL SZ TZ UG ZW 
(EA) AM AZ BY KG KZ MD RU TJ TM 

Publication Language: English 
Filing Language: English 
Full text Word Count: 10226

Full text Availability: 
Detailed Description 

Detailed Description 

... META tags, and some index other parts of a Web page, such as title,
headings, etc. Most search engines require a search to be conducted
by typing in keywords. The way in which the search query is
formulated may be by Boolean logic, where keywords are used with various
terms, or by natural language, where keywords are used in the form of a
question. Although natural language searches may be easier for a user to
formulate, both types of formulations rely on keywords.

Most search engines use mathematical algorithms to weigh or rank the
results, with the most relevant items listed first. These rankings may
be based on the number of times a keyword is used on a page or
the location of the keyword on the page. Some search engines also
allow the user to organize or group the results by category, date, or
other variable, such as the folders used by Northern Light, U.S. Patent
No. 5,924,090 to Krellenstein. Another search engine, known as the
Clever Project, by IBM, analyzes hyperlinks between pages, in addition
to text...

...citations, in order to develop algorithms that are intended to
increase the relevancy of search results. This method is a marginal
improvement over other search engines, but has its own set of problems.

"A shortcoming of Clever has been that
...
Detailed Description

content of the document input. "Segment importance" is defined as a
measure of how related a given segment is to presenting key information
about the article as a whole. The preferred metric, as included in the
Segmenter code listing of Appendix A, is Term Frequency (TF) *
Segment Frequency (SF). TF refers to the number of times the term
appears in the document, whereas SF refers to the number of segments
containing that term. As such, the present invention utilizes a variant
of Salton's (1989) information retrieval metric, Term Frequency *
Inverse Document Frequency (TF*IDF), to calculate the importance of a
particular given segment. See G. Salton, Automatic Text Processing:
The Transformation, Analysis, and Retrieval of Information by Computer
(Addison-Wesley, Reading, Massachusetts 1989).
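
The TF * SF weighting can be illustrated with a short sketch. The Appendix A listing is not reproduced in this excerpt, so the following is not that code: it assumes whitespace tokenization, and it assumes that a segment's importance is the sum of the TF * SF weights of its distinct terms, which the excerpt does not state. The function names are hypothetical.

```python
from collections import Counter


def term_weights(segments):
    """Weight each term by TF * SF, where TF is its count in the whole
    document and SF is the number of segments that contain it."""
    tokenized = [seg.lower().split() for seg in segments]
    tf = Counter(tok for seg in tokenized for tok in seg)
    sf = Counter(tok for seg in tokenized for tok in set(seg))
    return {term: tf[term] * sf[term] for term in tf}


def segment_importance(segments):
    """Assumed aggregation: sum the weights of each segment's distinct terms."""
    weights = term_weights(segments)
    return [
        sum(weights[term] for term in set(seg.lower().split()))
        for seg in segments
    ]
```

Unlike TF*IDF, which discounts terms that are spread widely, this weighting rewards terms that recur across many segments, which fits the stated goal of finding segments related to the article as a whole.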

Intuitively, a segment containing noun phrases used...
AE AG AL AM AT AU AZ BA BB BG BR BY CA CH CN CR CU CZ DE DK DM DZ EE ES
FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU
LV MA MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT
TZ UA UG UZ VN YU ZA ZW
(EP) AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE
(OA) BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG
(AP) GH GM KE LS MW SD SL SZ TZ UG ZW
(EA) AM AZ BY KG KZ MD RU TJ TM
Publication Language: English 
Filing Language: English 
Full text Word Count: 49717 



Full text Availability: 
Detailed Description 

Detailed Description 

... for a complete re-index each time a document changes. Online
identifiers may be provided, so that searches can continue while the
identifiers are modified. This function is also provided by the Verity
software.

At...



...stored in the term lists 836. For example, a simple weighting algorithm
might take a single term query, such as a category of information,
and rank each document in a term list 836 in numerical order
according to the product of the term frequency (the number of times
a term appears in the document) and the inverse document frequency (the
inverse of the number of times the term appears in the entire
document set).
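
The simple weighting algorithm described above can be sketched as follows, using the excerpt's own definition of inverse document frequency (the inverse of the term's total count across the document set) rather than the more common logarithmic form. The function name and the whitespace tokenization are assumptions made for illustration.

```python
from collections import Counter


def rank_by_tf_idf(documents, term):
    """Rank documents for a single-term query by TF * IDF, where
    IDF = 1 / (occurrences of the term in the entire document set)."""
    term = term.lower()
    tf = {
        doc_id: Counter(text.lower().split())[term]
        for doc_id, text in documents.items()
    }
    total = sum(tf.values())
    if total == 0:
        return []  # the term appears nowhere, so nothing to rank
    idf = 1.0 / total
    scored = [(doc_id, tf[doc_id] * idf) for doc_id in documents if tf[doc_id] > 0]
    # Highest score first; ties broken by document id.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))
```

For a single-term query the IDF factor is the same for every document, so it rescales the scores without changing their order; it only affects the ranking once scores for several terms are combined.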

Once the documents are ranked, at a step 30 a list of the ranked
documents may be further processed by the information retrieval
software to provide a results page. In particular, at the step 30, the
information retrieval software 908 may determine categories into which
the retrieved documents fall. In an embodiment, the categories are
yellow pages categories, which have been previously assigned to...

...entry of the business listings in the Primary Database 812. Thus, at the
step 30, the information retrieval software 908 determines what
categories are associated with the business listings retrieved by the
ranking at the...to

categories is that additional information about the user's preferences
may be available from the user query. A system that relies only on the
categories ignores any information from the user query that might
permit further refinement of the advertisement selection.
Referring to Figure 70, once the banner ad retrieval software 909 has
obtained the terms in the user query and the terms in each of the
matching categories, the terms may be weighted or normalized by the
number of occurrences of the terms and the number of listings in
which a term occurs in a step 74.

Next, at a step 79, the banner ad retrieval software 909 may locate 
the particular terms that appear in the user query and in the 
categories obtained at the steps 60 and 62 in the banner ad term lists... 

...that appears in a user's query or in a category, such as a yellow pages
category, retrieved by the information retrieval software 909. Thus,
for a given term, such as "restaurant," a linked list...

...the term. The elements 74 may include sub-elements, including a document
identifier 76 for identifying the category and certain statistics
regarding the document, including the term frequency 78, TF, which
indicates the number of times the term appears in the document, and
the inverse document frequency 80, IDF, which indicates the inverse of
the number of times the term appears in the entire set of documents
that are being searched.

From the table of linked lists of super-category terms established
in the step 77, the banner ad retrieval software 909 may at a step 81
rank the super-categories. In particular, the system at the step 81 may
rank the documents, i.e., the super-categories, according to the
appearance of the words occurring in the user query and in the categories.

The ranking may be performed by a variety of techniques. One such
technique obtains a number for each term that appears in the user query
and in the categories that consists of the product of the term frequency
for that term and...
Publication Language: English
Full text Word Count: 26451

Full text Availability:
Detailed Description

Detailed Description

DIANA ROSS & LUCIANO PAVORATI collaborated on a song

2. In the event of CD Universe not finding a perfect match, the user
will have to select from multiple hits on the artist name in...

...enabled for songs that have multiple licenses. The information included
will be a subset of the following:
Title
Composer
Artist
Album
Genre
Release Date
Play time
A list of other artists who have recorded this song (Artist Name,
Album Title, Release Date)
None of this information will have underlying links

The Times Recorded column will indicate the number of times the song
was recorded. It may also be possible to list the songs in descending
order by the number of times recorded. Alternatively, it may be desirable
to...

...The Active Flag column will indicate whether royalties have been paid on
this song in the last 3 years.

Search by Artist Hit List

The CD Universe icon will be the...
SYSTEM AND METHOD FOR CORRECTING SPELLING ERRORS IN SEARCH QUERIES
SYSTEME ET PROCEDE DE CORRECTION D'ERREURS D'ORTHOGRAPHE DANS DES DEMANDES
DE RECHERCHE
AE AL AM AT AT AU AZ BA BB BG BR BY CA CH CN CU CZ CZ DE DE DK DK EE EE
ES FI FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS
LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SK SL TJ TM
TR TT UA UG UZ VN YU ZA ZW GH GM KE LS MW SD SL SZ UG ZW AM AZ BY KG KZ
MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ
CF CG CI CM GA GN GW ML MR NE SN TD TG

Publication Language: English 

Full text Word Count: 8073

Full text Availability: 
Detailed Description 

Detailed Description 

... field-corresponding terms from the related terms lists, so that a 
non-matching term within a given search field will only be compared to 
related terms of the same field. Thus, for example, a non... 

...For example, if an erroneous query is received which includes the 
matching term mountain within the title field 43, the spelling 
correction process 48 will search for a table entry having the keyword 
TMOUNTAIN . 

As further depicted in Figure 3, the correlation table 50 also
preferably includes correlation scores 64 that indicate the number of
times each related term has appeared in combination with the keyword.
For example, the term PROGRAMMING has a score of 320 in the entry for
JAVA, indicating that JAVA and PROGRAMMING appeared...
...correlations. As described below, the scores 64 are preferably used to 
merge related terms lists when a query has multiple matching terms. 
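
A minimal sketch of how scored related-terms lists might be merged is given below. The excerpt says only that the scores 64 are used for merging, so the rule used here (keep terms common to every list and sum their scores) is an assumption. The table contents are invented apart from the JAVA/PROGRAMMING score of 320, and the field prefixes are omitted.

```python
from typing import Dict, List, Tuple

# Hypothetical correlation table: keyword -> {related term: score}.
# Only the JAVA/PROGRAMMING score of 320 comes from the text.
CORRELATION_TABLE: Dict[str, Dict[str, int]] = {
    "JAVA": {"PROGRAMMING": 320, "COFFEE": 40, "API": 90},
    "API": {"PROGRAMMING": 150, "JAVA": 90, "REFERENCE": 60},
}


def merge_related_terms(
    matching_terms: List[str],
    table: Dict[str, Dict[str, int]] = CORRELATION_TABLE,
) -> List[Tuple[str, int]]:
    """Merge the related-terms lists of all matching terms.

    Assumed rule: keep only terms present in every list and rank them
    by the sum of their correlation scores, highest first.
    """
    lists = [table.get(term, {}) for term in matching_terms]
    if not lists:
        return []
    common = set(lists[0])
    for related in lists[1:]:
        common &= set(related)
    merged = {term: sum(rel[term] for rel in lists) for term in common}
    return sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))
```

With a single matching term the function simply returns that term's list in score order; with several, only the shared related terms survive.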



In operation, when the query server 38 determines that a query contains
both...
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AL AM AT AU A2 BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GD GE GH 
GM HR HU ID IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW 
MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZW 
GH GM KE LS MW SD SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK 
ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE 
SN TD TG 

Publication Language: English 

Full text Word Count: 7532 

Patent and Priority Information (Country, Number, Date): 

Patent: . . . 19990429 

Full text Availability: 

Detailed Description 
Publication Year: 1999 

Detailed Description 

... type, $projectid, $location);

2  Deleting an Object
       delete from Objects where Title = '$title';

3  Adding a Link to an Object
       insert into Links (ObjectID, Object2ID)
       VALUES ($objectid, $object2id);

14 Search Project Profiles and count the number of times a list of
   keywords appear in each Project.
       SELECT Projprof.ProjectID, Projects.Title,
           Count(Projects.Title) as [Count Of Title]
       from Projects inner join Projprof
           ON Projects.ProjectID = Projprof.ProjectID
       WHERE (((Projprof.Keyword) = '$keyword1'
           OR (Projprof.Keyword) = '$keyword2'
           OR (Projprof.Keyword) = '$keyword3'))
       GROUP BY Projprof.ProjectID, Projects.Title;

15 Find out what Projects a User is currently...
       SELECT Projects.Title, workers.OwnerID
       from Projects inner join workers...

   ... user is working on.
       ...update linkstatus set url = '$url', Title = '$title'
       where userid = '$userid';

19 Find out what Project a User is currently...
       SELECT * FROM linkstatus where userid = '$userid'.
   Publish a new object...
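
The keyword-count query in entry 14 can be tried out with a small Python sketch using the standard sqlite3 module. The table and column names follow the excerpt, but the schema details, the sample rows, the IN (...) form of the keyword test, the result ordering, and the CountOfTitle alias are assumptions made for illustration. Bound parameters replace the '$keyword' string substitution.

```python
import sqlite3
from typing import List, Sequence, Tuple


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Projects (ProjectID INTEGER PRIMARY KEY, Title TEXT);
        CREATE TABLE Projprof (ProjectID INTEGER, Keyword TEXT);
        INSERT INTO Projects VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
        INSERT INTO Projprof VALUES
            (1, 'search'), (1, 'index'), (2, 'search'), (3, 'audio');
        """
    )
    return conn


def count_keyword_hits(
    conn: sqlite3.Connection, keywords: Sequence[str]
) -> List[Tuple[int, str, int]]:
    """Count how many of the given keywords appear in each project profile."""
    if not keywords:
        return []
    placeholders = ", ".join("?" for _ in keywords)
    sql = f"""
        SELECT Projprof.ProjectID, Projects.Title,
               COUNT(Projects.Title) AS CountOfTitle
        FROM Projects INNER JOIN Projprof
             ON Projects.ProjectID = Projprof.ProjectID
        WHERE Projprof.Keyword IN ({placeholders})
        GROUP BY Projprof.ProjectID, Projects.Title
        ORDER BY CountOfTitle DESC, Projprof.ProjectID
    """
    return conn.execute(sql, list(keywords)).fetchall()
```

Only the placeholder marks are interpolated into the SQL text; the keyword values themselves are passed as bound parameters.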
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AL AM AT AU A2 BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE GH HU 
IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL
PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG UZ VN YU ZW GH GM KE LS MW 
SD SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR 
IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN ML MR NE SN TD TG

Publication Language: English 

Full text word Count: 12259 

Patent and Priority Information (Country, Number, Date): 

Patent: ... 19981223 

Full text Availability: 

Detailed Description 
Publication Year: 1998 

Detailed Description 
are. 

T - corpus, a set of all the documents or text at hand. 

A - corpus dictionary. A list of words/phrases (hereafter, word and
term will mean either words or phrases).

D - a document.

a document vector (w1, w2, w3, ..., wn), where wi is the number of
occurrences of the i-th word in the dictionary that occur in D.

C - a cluster, a set of documents which are classified together. The
terms cluster/subcollection/category may be used interchangeably.
They all mean a subcollection of documents within the corpus.
i@ - a cluster. . . 
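
The document vector defined above can be sketched in Python as follows. The lower-casing and whitespace tokenization are assumptions, and the sketch counts single words only; multi-word phrases in the dictionary would need a separate matching step.

```python
from collections import Counter
from typing import List, Sequence


def document_vector(dictionary: Sequence[str], document_text: str) -> List[int]:
    """Return (w1, ..., wn), where wi counts occurrences in the document
    of the i-th dictionary word.

    Assumes lower-case dictionary entries and whitespace-separated words.
    """
    counts = Counter(document_text.lower().split())
    return [counts[word] for word in dictionary]
```

For example, with the dictionary ["cat", "dog", "fish"], the text "The cat saw the dog and the other cat" gives the vector [2, 1, 0].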



...To simplify the discussion the notion I v I will be equivalent to vl. 

In order to find significant words in a cluster, it is necessary to 
determine whether differences in frequency of occurrence of... 
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METHOD AND APPARATUS FOR ELECTRONIC DISTRIBUTION OF DIGITAL MULTI-MEDIA 
INFORMATION 
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AM AT AU BB BG BR BY CA CH CN CZ DE DK EE ES Fl GB GE HU IS DP KE KG KP 
KR KZ LK LR LT LU LV MD MG MN MW MX NO NZ PL PT RO RU SD SE SG SI SK TJ
TM TT UA UG UZ VN KE MW SD SZ UG AT BE CH DE DK ES FR GB GR IE IT LU MC 
NL PT SE BF BJ CF CG CI CM GA GN ML MR NE SN TD TG 

Publication Language: English 

Full text Word Count: 4692 

Patent and Priority Information (Country, Number, Date):

Patent: ... 19960314 

Full text Availability: 

Detailed Description 
Publication Year: 1996 

Detailed Description 

... the remaining functions performed by network and segment producer 27. 

The network segment producer is responsible for retrieving segments
from the multimedia file server, and transmitting them to the encoder
multiplexer. This task is accomplished as follows.

Based upon the segments which exist on the multimedia file server, build
a list of segments which need to be transmitted. Prioritize the list
considering the age of each segment, the number of times the
segment has been transmitted previously, etc.

Select the most urgent segment to transmit.

Transmit the segment header information detailing the segment name,
size, creation, etc.

Simultaneously activate a control relay (indicating the start of the 
segment) and begin transmitting... 
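
The prioritization and selection steps above can be sketched in Python. The excerpt names the factors (age, number of previous transmissions) but not how they are weighed, so the ordering used here (fewest previous transmissions first, ties broken by oldest creation time) is an assumption, as are the Segment field names.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Segment:
    # Hypothetical record; field names are illustrative only.
    name: str
    size: int
    created: float  # creation time, seconds since the epoch
    times_transmitted: int = 0


def prioritize(segments: List[Segment]) -> List[Segment]:
    """Order segments by assumed urgency.

    Assumption: least-transmitted first, then oldest first.
    """
    return sorted(segments, key=lambda s: (s.times_transmitted, s.created))


def select_most_urgent(segments: List[Segment]) -> Optional[Segment]:
    """Return the segment at the head of the prioritized list, if any."""
    ordered = prioritize(segments)
    return ordered[0] if ordered else None


def header_for(segment: Segment) -> Dict[str, object]:
    """Header information detailing the segment name, size, and creation."""
    return {"name": segment.name, "size": segment.size, "created": segment.created}
```

After a segment is sent, its times_transmitted counter would be incremented so that it moves down the list on the next pass.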
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Claims 
Publication Year: 1995 

Claim 

correlation 
measures includes: 

deriving for each word of a pair a measure of the 
observed probability of finding that word in its 
respective corpus; 

deriving for each chosen pair of words a measure of 
the observed probability of finding that pair of words 
in aligned portions of the corpora; and 
combining the pair probability with the... 

...A method as claimed in any preceding claim 
wherein the statistical database comprises: 
for each corpus a table of word frequencies; 
for the aligned corpora as a whole a table of word 
pair frequencies, counting the number of times a given 
pair of words (one from each corpus) occurs in aligned 
portions of the corpora, 

14 A method as claimed in claim 13 wherein said 
each pair of text. . .correlation 
measures includes: 

deriving for each word of a pair a measure of the 
observed probability of finding that word in its 
respective corpus; 

deriving for each chosen pair of words a measure of
the observed probability of finding that pair of words 
in aligned portions of the corpora; and 
combining the pair probability with the... 

...claimed in any of claims 18 to 23 
wherein the statistical database comprises: 
for each corpus a table of word frequencies; 
for the aligned corpora as a whole, a table of word 
pair frequencies, counting the number of times a given 
pair of words (one from each corpus) occurs in aligned 
portions of the corpora. 
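
The statistical database recited above (a word frequency table per corpus plus a word pair frequency table for the aligned corpora) can be sketched in Python. The representation of aligned portions as token-list pairs, the choice to count a pair once per aligned portion, and the two probability estimates are assumptions. How the pair probability is then combined with the word probabilities is not shown, because the claim text is truncated at that point.

```python
from collections import Counter
from itertools import product
from typing import Iterable, List, Tuple

AlignedPortion = Tuple[List[str], List[str]]


def build_statistics(aligned: Iterable[AlignedPortion]):
    """Build the word and word-pair frequency tables.

    Returns (first_corpus_freq, second_corpus_freq, pair_freq, portions).
    Assumption: a pair is counted once per aligned portion it occurs in.
    """
    first_freq, second_freq, pair_freq = Counter(), Counter(), Counter()
    portions = 0
    for first_tokens, second_tokens in aligned:
        portions += 1
        first_freq.update(first_tokens)
        second_freq.update(second_tokens)
        pair_freq.update(product(set(first_tokens), set(second_tokens)))
    return first_freq, second_freq, pair_freq, portions


def word_probability(freq: Counter, word: str) -> float:
    """Observed probability of finding `word` in its corpus."""
    total = sum(freq.values())
    return freq[word] / total if total else 0.0


def pair_probability(pair_freq: Counter, pair: Tuple[str, str], portions: int) -> float:
    """Observed probability of finding `pair` in an aligned portion."""
    return pair_freq[pair] / portions if portions else 0.0
```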



25 A method as claimed in claim 24 wherein said 
each pair of text...
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Claims 
Publication Year: 1982 

Claim 

... to the primary operation Tritmation, as determinable by functional
equivalency tests, such as reduction to truth tables or to Post canonical
forms;

(b) applying the string-manipulating operation to the signals resulting
from the preceding application of said operation; and

(c) repeating step (b) a sufficient number of times for storing in said
apparatus said representation of the recurrent cycle of strings for
a finite duration, thereby enabling the retrieval of any portion
thereof for pattern analysis.

9 In data processing apparatus, the method of processing the signals
representing the...operations and their functional equivalents, as
determinable by functional equivalency tests, such as reduction to
truth tables or to Post canonical forms;

(b) applying the string-manipulating operation to the signals resulting
from the preceding application of said operation; and

(c) repeating step (b) a sufficient number of times for storing in said
apparatus said representation of the recurrent cycle of strings for
a finite duration, thereby enabling the retrieval of any portion
thereof for pattern analysis.
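
Steps (b) and (c) can be illustrated with a short Python sketch that reapplies an operation to its own output until a string recurs, then returns the stored cycle. The claimed primary operation is not recoverable from this excerpt, so rotate_xor below is a purely hypothetical stand-in, and the function names and the max_steps limit are likewise assumptions.

```python
from typing import Callable, Dict, List


def iterate_until_cycle(
    operation: Callable[[str], str], seed: str, max_steps: int = 10_000
) -> List[str]:
    """Reapply `operation` to its own result until a string recurs,
    then return the recurrent cycle of strings in order."""
    first_seen: Dict[str, int] = {}
    history: List[str] = []
    current = seed
    for step in range(max_steps):
        if current in first_seen:
            # Everything from the first occurrence onward repeats forever.
            return history[first_seen[current]:]
        first_seen[current] = step
        history.append(current)
        current = operation(current)
    raise RuntimeError("no cycle found within max_steps")


def rotate_xor(bits: str) -> str:
    """Hypothetical stand-in operation (not the claimed one):
    XOR each bit with its cyclic right-hand neighbour."""
    n = len(bits)
    return "".join(str(int(bits[i]) ^ int(bits[(i + 1) % n])) for i in range(n))
```

Because the stored cycle is a plain list, any portion of it can be retrieved by slicing for later pattern analysis.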



